Apache Tika 1.17 has been released. Tika is a toolkit for extracting text content. It integrates POI and PDFBox and provides a unified interface for text extraction. In addition, Tika offers a convenient extension API for broadening its support for third-party file formats.
Apache Tika 1.17 includes many improvements and bug fixes:
Fix thread-safety in ChmExtractor (TIKA-2519).
Upgrade cxf to 3.0.16 (TIKA-2516).
Allow users to configure maxMainMemoryBytes for PDFs via shrike (PR-213).
Extract underline and strikethrough in docx (TIKA-2347 and TIKA-2512).
Cache TikaConfig in EmbeddedDocumentUtil for better performance in documents with large number of attachments (TIKA-2511).
Extract media files from ooxml (TIKA-2510).
Standardize the way the Image and Video captioning dockers and extraction work (TIKA-2400, GitHub-208).
Upgrade to xmpcore 5.1.3 (TIKA-2034).
Upgrade to metadata-extractor 2.10.1 (TIKA-2486).
Upgrade to OpenNLP 1.8.3 (TIKA-2502).
Upgrade to Jackson 2.9.2 (TIKA-2501).
Release details: http://www.apache.org/dist/tika/CHANGES-1.17.txt
Download: http://www.apache.org/dyn/closer.cgi/tika/apache-tika-1.17-src.zip
Source: OSChina (Open Source China) community