Extract text from .rtf and .doc files with PHP and COM

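Ever wanted to extract text from a Microsoft Word document with PHP and COM? It’s not as hard as it seems. You need a Windows machine with Microsoft Word installed and PHP's COM support enabled. Create a Word document on your computer (the example uses c:\TEST.doc) and use this piece of code: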

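<?php
// Requires Windows, Microsoft Word and PHP's COM extension (com_dotnet).
$name = 'c:\\TEST.doc';
$word = null;
try {
	// new COM() throws an exception on failure, so it belongs inside the try block.
	$word = new COM("word.application");
	$word->Documents->Open($name);
	// Casting the Content range to a string yields the document text.
	$content = (string) $word->ActiveDocument->Content;
	$word->ActiveDocument->Close(false); // close without saving changes
} catch (Exception $e) {
	$content = "Error!";
}
if ($word !== null) {
	$word->Quit(); // otherwise a WINWORD.EXE process keeps running
	$word = null;
}
echo $content;
?>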

You can also loop over a folder and extract the text from every .doc file in it.
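A minimal sketch of that folder loop, under a few assumptions: the helper names `findWordFiles` and `extractText` and the folder `c:\docs` are made up for illustration, and .rtf files are included because Word can open them the same way as .doc files. The COM part only runs where the `COM` class is available, that is, on Windows with the COM extension enabled.

```php
<?php
// Hypothetical helper: list the .doc and .rtf files directly inside $dir.
// Pass an absolute folder path so Word receives absolute file paths.
function findWordFiles(string $dir): array
{
    if (!is_dir($dir)) {
        return [];
    }
    $files = [];
    foreach (scandir($dir) ?: [] as $entry) {
        $path = $dir . DIRECTORY_SEPARATOR . $entry;
        $ext  = strtolower(pathinfo($entry, PATHINFO_EXTENSION));
        if (is_file($path) && in_array($ext, ['doc', 'rtf'], true)) {
            $files[] = $path;
        }
    }
    sort($files);
    return $files;
}

// Hypothetical helper: open one file in an already running Word instance
// and return its text, or "Error!" if Word could not process it.
function extractText($word, string $path): string
{
    try {
        $doc  = $word->Documents->Open($path);
        $text = (string) $doc->Content->Text;
        $doc->Close(false); // close without saving changes
        return $text;
    } catch (Exception $e) {
        return "Error!";
    }
}

$folder = 'c:\\docs'; // assumed folder, adjust to your setup

// Only touch COM where it can work.
if (class_exists('COM') && is_dir($folder)) {
    try {
        // Start Word once and reuse it for every file.
        $word = new COM("word.application");
        foreach (findWordFiles($folder) as $path) {
            echo $path, ":\n", extractText($word, $path), "\n\n";
        }
        $word->Quit();
    } catch (Exception $e) {
        echo "Unable to start Word\n";
    }
}
```

Starting Word once and reusing the same instance for all files avoids paying the application startup cost per document. The file listing is kept separate from the COM calls so it can be checked on any machine, even one without Word.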



One Response to “Extract text from .rtf and .doc files with PHP and COM”

  1. Thanks for this info!
