Description
When a post whose serialized meta data contains CRLF line endings is imported while the SimpleXML PHP extension is in use, the unserialize function emits an error and the meta value is missing from the database.
unserialize(): Error at offset 185 of 185 bytes
wp-includes/functions.php:650
unserialize()
wp-includes/functions.php:650
maybe_unserialize()
wp-content/plugins/wordpress-importer/class-wp-import.php:891
WP_Import->process_posts()
wp-content/plugins/wordpress-importer/class-wp-import.php:89
WP_Import->import()
wp-content/plugins/wordpress-importer/class-wp-import.php:65
WP_Import->dispatch()
wp-admin/admin.php:364
It seems that the SimpleXML PHP extension has a bug that replaces the \r\n (0x0D 0x0A) sequences with \n (0x0A). This shortens the data, because the \r characters are removed. The string length recorded inside the serialized data is then no longer correct, and the unserialize function fails because of it.
Judging by the date of this Stack Overflow question, this has been the case for at least 10 years.
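To make the length mismatch concrete, here is a minimal standalone sketch. It does not involve the importer or SimpleXML at all: the str_replace call is only an assumption standing in for the replacement described above, and the array key and sample text are illustrative.

```php
<?php
// Serialize a value containing a CRLF line ending, the way array meta is stored.
$original = serialize(['html_markup' => "<h2>CRLF</h2>\r\nLorem ipsum"]);

// Simulate the line-ending replacement: each CRLF pair becomes a single LF.
// (Assumption: this stands in for what happens to the value during parsing.)
$normalized = str_replace("\r\n", "\n", $original);

// The s:26: length prefix still counts the removed \r byte, so the payload
// is now one byte shorter than declared and unserialize() cannot parse it.
$restored = @unserialize($normalized);

var_dump(strlen($original) - strlen($normalized)); // int(1)
var_dump($restored === false);                     // bool(true)
```

Each removed \r shifts the expected closing quote by one byte, which is consistent with the error being reported at the very end of the data (offset 185 of 185 bytes).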
https://stackoverflow.com/questions/27871572/php-simplexml-modifies-line-break-characters-in-cdata-elements
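The behaviour can be checked outside of WordPress with a small sketch like the one below. The helper name cdata_round_trip is hypothetical and the payloads are made up; the script only prints the byte counts before and after parsing so that the result can be compared on a given PHP build.

```php
<?php
// Hypothetical helper: wraps a payload in CDATA, parses it with SimpleXML,
// and returns the text content that comes back out.
function cdata_round_trip(string $payload): string
{
    // Assumes $payload does not contain the CDATA terminator "]]>".
    $xml = '<root><![CDATA[' . $payload . ']]></root>';

    $element = simplexml_load_string($xml);
    if ($element === false) {
        throw new RuntimeException('The XML could not be parsed.');
    }

    return (string) $element;
}

// Compare an LF payload with a CRLF payload.
foreach (['LF' => "a\nb", 'CRLF' => "a\r\nb"] as $label => $payload) {
    $result = cdata_round_trip($payload);
    printf(
        "%s: %d bytes in, %d bytes out (hex: %s)\n",
        $label,
        strlen($payload),
        strlen($result),
        bin2hex($result)
    );
}
```

If the CRLF payload comes back one byte shorter than it went in, the parser has dropped the \r, which matches the symptom described in this issue.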
The import is successful when the SimpleXML extension is unavailable and the XML extension is used instead.
A quick way to test this without modifying the PHP configuration files is to modify the code temporarily in class-wxr-parser.php.
/* Line 15: */ if ( false && extension_loaded( 'simplexml' ) ) { ... }
Steps to reproduce this bug
- Create a new WordPress installation.
- Run the script provided below with wp eval-file filename.php to create the sample posts.
- Observe two rows being added to the wp_postmeta table.
- Create an export WXR file that contains these posts.
- Move the two posts to the trash and then delete them permanently.
- Observe the two rows vanishing from the wp_postmeta table as the posts were deleted.
- Import the WXR file that you just created. An error is emitted from the unserialize function.
- Observe that the meta data containing CRLF line endings is missing from the database. The meta data with LF line endings has been imported successfully.
Sample data creation script
<?php
if (!defined('WP_CLI')) die('NOT RUN FROM CLI');

if (!extension_loaded('simplexml')) {
    WP_CLI::error('SimpleXML php extension is not loaded! The bug this issue is describing will not manifest!');
} else {
    WP_CLI::log('SimpleXML php extension is available');
}

if (!extension_loaded('xml')) {
    WP_CLI::log('XML php extension is not loaded.');
} else {
    WP_CLI::log('XML php extension is available');
}

$post_crlf = wp_insert_post([
    'post_author' => 1,
    'post_title' => 'A post with serialized metadata - CRLF',
    'post_status' => 'publish',
    'meta_input' => [
        'example_meta' => [
            'boolean' => true,
            'integer' => 42,
            'html_markup' => "<h2>CRLF</h2>\r\nLorem ipsum dolor sit amet\r\n\r\nQuisque ligula eros ullamcorper quis, lacinia quis facilisis sed sapien."
        ]
    ]
], true);

if (is_wp_error($post_crlf)) {
    WP_CLI::error('An error happened when inserting the CRLF post');
} else {
    WP_CLI::success("Successfully inserted the CRLF post to the database with an ID: $post_crlf");
}

$post_lf = wp_insert_post([
    'post_author' => 1,
    'post_title' => 'A post with serialized metadata - LF',
    'post_status' => 'publish',
    'meta_input' => [
        'example_meta' => [
            'boolean' => true,
            'integer' => 42,
            'html_markup' => "<h2>LF</h2>\nLorem ipsum dolor sit amet\n\nQuisque ligula eros ullamcorper quis, lacinia quis facilisis sed sapien."
        ]
    ]
], true);

if (is_wp_error($post_lf)) {
    WP_CLI::error('An error happened when inserting the LF post');
} else {
    WP_CLI::success("Successfully inserted the LF post to the database with an ID: $post_lf");
}
Environment information
PHP Version 8.3.16
WordPress 6.7.1
MySQL 8.0.33
Apple M1 Pro ARM64
MacOS Sonoma 14.7.2 (23H311)
Screenshots
The wp_postmeta table after the posts have been created with the script.

The wp_postmeta table after the posts were deleted and the WXR file was imported.
