Serialized strings differ between instances that implement Serializable and those that don't.
Instances that don't implement Serializable use the Object notation "O:" when serialized, while those that do use the Class notation "C:". Class notation can only be used to unserialize instances that implement Serializable, while the Object notation can be used to unserialize any object.
Because of this, it is sometimes useful to implement the __wakeup() method alongside Serializable: for example, when you still hold strings that were serialized before the class implemented Serializable (backwards compatibility), or when you expect serialized objects from an external source that uses Object notation for maximum compatibility. You can also use __wakeup() to share processing with your unserialize() method, or to guard against people trying to bypass unserialize().
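As a quick check of the two notations described above, the sketch below serializes one object of each kind and prints the first two characters of the result. The class names PlainNote and CustomNote are made up for this illustration only. On PHP 8.1 and later, a class that implements Serializable without __serialize()/__unserialize() also raises a deprecation notice, but the code still runs.
<?php
// Hypothetical classes, used only to compare the two notations.
class PlainNote {
    public $text = 'hello';
}

class CustomNote implements Serializable {
    public $text = 'hello';

    public function serialize() {
        return $this->text;
    }

    public function unserialize( $data ) {
        $this->text = $data;
    }
}

$plain  = serialize( new PlainNote );
$custom = serialize( new CustomNote );

var_dump( substr( $plain, 0, 2 ) );  // string(2) "O:" (Object notation)
var_dump( substr( $custom, 0, 2 ) ); // string(2) "C:" (Class notation)
?>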
Below is an example of a simple class hierarchy, where A is a standard class, B implements Serializable, and C uses __wakeup() to assist with unserializing it.
<?php
class A {
    protected $readonly_data = true;
    public $public_data = true;

    public function __construct( $data = true ) {
        $this->public_data = $data;
    }

    public function get_readonly_data() {
        return $this->readonly_data;
    }
}
$a = new A;
var_dump( $a );
var_dump( serialize( $a ) );
?>
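Dumping an instance of A shows both properties set to true, and its serialized string uses the Object notation "O:", i.e. O:1:"A":2:{s:16:"\0*\0readonly_data";b:1;s:11:"public_data";b:1;}. Please note that there is a null byte "\0" on either side of the asterisk (*) in the protected property's name.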
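Because the null bytes are invisible in most output, a small sketch like the following can make them visible. It assumes a stand-in class named Sample (not part of the hierarchy in this note) with one protected, one private and one public property, and replaces each real null byte with the two characters \0 before dumping the string. The lengths in the output (10 and 14) still count each null byte as a single byte.
<?php
// Hypothetical stand-in class, used only to show how property names are stored.
class Sample {
    protected $guarded = true;
    private $hidden = true;
    public $open = true;
}

// Replace each real null byte with a visible backslash-zero pair.
$visible = str_replace( "\0", '\0', serialize( new Sample ) );

var_dump( $visible );
// The dumped string reads:
// O:6:"Sample":3:{s:10:"\0*\0guarded";b:1;s:14:"\0Sample\0hidden";b:1;s:4:"open";b:1;}
?>
Protected names are wrapped as \0*\0name and private names as \0ClassName\0name, while public names are stored as they are.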
Changing the serialized string and then unserializing it can change protected and private values.
<?php
var_dump( unserialize( "O:1:\"A\":2:{s:16:\"\0*\0readonly_data\";b:0;s:11:\"public_data\";b:0;}" ) );
?>
Class B extends A, and so has the same constructor and properties. It also implements Serializable.
<?php
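class B extends A implements Serializable {
    // Only the public data is stored; readonly_data is left out.
    public function serialize() {
        return serialize( $this->public_data );
    }

    public function unserialize( $data ) {
        $this->public_data = unserialize( $data );
        // Placeholder for your own validation or set-up logic.
        do_extra_processing_here();
    }
}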
$b = new B;
var_dump( serialize( $b ) );
?>
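As well as being a lot shorter, the serialized string uses the Class notation "C:". You can still unserialize a B from the Object notation, but doing so completely ignores the unserialize() method: properties are written directly, potentially updating the wrong information, and do_extra_processing_here() from the example above is never called. Be aware that not every PHP release accepts this (PHP 5.4.29 and 5.5.13, for example, reject it with an "Erroneous data format for unserializing" warning), so check the behaviour on your target version.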
<?php
var_dump( unserialize( "O:1:\"B\":2:{s:16:\"\0*\0readonly_data\";b:0;s:11:\"public_data\";b:0;}" ) );
?>
Class C extends B, so it's already using the serialize() and unserialize() functions. By implementing the __wakeup() method, we ensure that we are validating the information and performing our do_extra_processing_here() function.
<?php
class C extends B {
    public function __wakeup() {
        $new = new static;
        $this->readonly_data = $new->get_readonly_data();
        do_extra_processing_here();
    }
}
var_dump( unserialize( "O:1:\"C\":2:{s:16:\"\0*\0readonly_data\";b:0;s:11:\"public_data\";b:0;}" ) );
?>
We can use __wakeup() to revert our readonly data back to what it was, or to add additional processing. You can additionally call __wakeup() from within unserialize() if you need to do the same process regardless of which serialized string notation was used.
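Calling __wakeup() from within unserialize() could look something like the sketch below. The Settings class, its properties and the clamping rule are all hypothetical and stand in for whatever validation you need. Only the Class notation path is exercised here, since that is the one serialize() produces for a Serializable class. On PHP 8.1 and later this also raises the Serializable deprecation notice.
<?php
// Hypothetical class: the names and the validation rule are illustrative only.
class Settings implements Serializable {
    public $limit = 10;
    public $woken = 0;

    public function serialize() {
        return serialize( $this->limit );
    }

    public function unserialize( $data ) {
        $this->limit = unserialize( $data );
        // Reuse the checks that Object notation would trigger via __wakeup().
        $this->__wakeup();
    }

    public function __wakeup() {
        // Shared post-processing: reset anything that is not a non-negative integer.
        if ( ! is_int( $this->limit ) || $this->limit < 0 ) {
            $this->limit = 10;
        }
        $this->woken++;
    }
}

$settings = new Settings;
$settings->limit = -5;

$copy = unserialize( serialize( $settings ) );
var_dump( $copy->limit ); // int(10): the invalid value was reset
var_dump( $copy->woken ); // int(1): __wakeup() ran once, called from unserialize()
?>
Keeping the shared logic in __wakeup() means there is a single place to maintain it, whichever notation the incoming string uses.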