I recently migrated a site to a different server and, for one reason or another, some of the images did not come over properly. While I could have just re-downloaded and re-imported all of the media, that would have taken quite a while since the media library was well over 100 GB. Instead, I opted to use WP-CLI to help find which images were missing:
    /**
     * Iterate over attachments and check to see if they actually exist.
     *
     * @subcommand validate-attachments
     * @synopsis --output=<csv-filename> [--log-found]
     */
    public function validate_attachments( $args, $assoc_args ) {
        $attachment_count = array_sum( (array) wp_count_posts( 'attachment' ) );

        // Flags such as --log-found arrive in $assoc_args, not in the positional $args.
        $log_found = isset( $assoc_args['log-found'] );

        $output_file    = $assoc_args['output'];
        $posts_per_page = 500;
        $paged          = 1;
        $output         = array();

        $progress = \WP_CLI\Utils\make_progress_bar( 'Checking ' . number_format( $attachment_count ) . ' attachments', $attachment_count );

        $file_descriptor = fopen( $output_file, 'w' );
        if ( false === $file_descriptor ) {
            \WP_CLI::error( "Could not open {$output_file} for writing." );
        }

        do {
            $attachments = get_posts( array(
                'post_type'      => 'attachment',
                'posts_per_page' => $posts_per_page,
                'paged'          => $paged,
            ) );

            foreach ( $attachments as $attachment ) {
                $url     = $attachment->guid;
                $request = wp_remote_head( $url );

                if ( is_wp_error( $request ) ) {
                    // The request itself failed (DNS, timeout, ...), so there is no status code.
                    $output[] = array( $url, 0, $request->get_error_message() );
                } else {
                    $code = (int) wp_remote_retrieve_response_code( $request );

                    // Always log failures; log successes only when --log-found is set.
                    if ( 200 !== $code || $log_found ) {
                        $output[] = array( $url, $code, wp_remote_retrieve_response_message( $request ) );
                    }
                }

                $progress->tick();
            }

            // Pause.
            sleep( 1 );
            $paged++;
        } while ( count( $attachments ) );

        $progress->finish();

        \WP_CLI\Utils\write_csv( $file_descriptor, $output );
        fclose( $file_descriptor );
    }
The benefit of this is that I can just take the CSV, grab the URLs out of it, replace the domain name, and wget just what I need.
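As a rough sketch of that follow-up step, something like the helper below could turn the CSV into a list for `wget -i`. It assumes the URL is in the first column (which is how the command above writes each row) and that the files are still reachable on the other host. The function name `rewrite_urls_for_download()`, the file names, and both example host names are placeholders of mine, not part of WP-CLI or the original site.

```php
<?php
/**
 * Read URLs from the first column of a CSV and swap the host name.
 *
 * Hypothetical helper: the name and the host names used with it are illustrative.
 *
 * @param string $csv_file  Path to the CSV written by the command.
 * @param string $from_host Host the URLs in the CSV point at.
 * @param string $to_host   Host to download the files from instead.
 * @return string[] Rewritten URLs, one per non-empty CSV row.
 */
function rewrite_urls_for_download( $csv_file, $from_host, $to_host ) {
    $urls   = array();
    $handle = fopen( $csv_file, 'r' );
    if ( false === $handle ) {
        return $urls;
    }

    while ( false !== ( $row = fgetcsv( $handle, 0, ',', '"', '\\' ) ) ) {
        // Skip blank lines, which fgetcsv() reports as array( null ).
        if ( empty( $row[0] ) ) {
            continue;
        }

        // Replace only the host portion directly after the scheme,
        // leaving the path (and any port) untouched.
        $urls[] = preg_replace(
            '#^(https?://)' . preg_quote( $from_host, '#' ) . '(?=[:/]|$)#i',
            '${1}' . $to_host,
            $row[0]
        );
    }

    fclose( $handle );

    return $urls;
}

// Example (assumed file names): build a list for `wget -i urls.txt`.
// $urls = rewrite_urls_for_download( 'missing.csv', 'new.example.com', 'old.example.com' );
// file_put_contents( 'urls.txt', implode( PHP_EOL, $urls ) . PHP_EOL );
```

Matching the host right after the scheme, rather than doing a plain string replace, avoids touching a path that happens to contain the domain name.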
It was also the first time I’ve used WP-CLI’s write_csv() function, which gave me a short pause since it’s not very well documented.