Changelog
v1.2.0
OkhttpDownloader now handles Chinese pages whose contentType header does not specify an encoding
The HTTP request timeout can be customized through the httpTimeOut attribute of the @Crawler annotation; the default is 15000 ms
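To illustrate the first v1.2.0 item, here is a minimal sketch of one common fallback for a response whose Content-Type header names no charset. It uses only the JDK and is not SeimiCrawler's actual implementation. The class name CharsetFallback, the method names, the 2048-byte scan window, and the final UTF-8 default are all assumptions made for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of SeimiCrawler.
public class CharsetFallback {
    // Matches "charset=xxx" in a Content-Type header or in a <meta> tag.
    private static final Pattern CHARSET =
            Pattern.compile("charset\\s*=\\s*[\"']?([\\w-]+)", Pattern.CASE_INSENSITIVE);

    // Returns the first charset named in the text, or null if none is usable.
    public static Charset fromText(String text) {
        if (text == null) {
            return null;
        }
        Matcher m = CHARSET.matcher(text);
        if (!m.find()) {
            return null;
        }
        try {
            return Charset.forName(m.group(1));
        } catch (IllegalArgumentException e) {
            // Illegal or unsupported charset name: treat as "not specified".
            return null;
        }
    }

    // Header first, then a <meta> declaration in the page head, then UTF-8 (assumed default).
    public static Charset detect(String contentTypeHeader, byte[] body) {
        Charset fromHeader = fromText(contentTypeHeader);
        if (fromHeader != null) {
            return fromHeader;
        }
        // ISO-8859-1 maps every byte to one char, so the ASCII markup survives the scan.
        int n = Math.min(body.length, 2048);
        String head = new String(body, 0, n, StandardCharsets.ISO_8859_1);
        Charset fromMeta = fromText(head);
        return fromMeta != null ? fromMeta : StandardCharsets.UTF_8;
    }

    public static void main(String[] args) {
        Charset gbk = Charset.forName("GBK");
        // "\u4e2d\u6587" is the word for "Chinese", encoded here as GBK bytes.
        byte[] page = "<html><head><meta charset=\"GBK\"></head><body>\u4e2d\u6587</body></html>"
                .getBytes(gbk);
        Charset detected = detect("text/html", page);
        System.out.println(detected.name());
        System.out.println(new String(page, detected).contains("\u4e2d\u6587"));
    }
}
```

Checking the header before the page body keeps an explicit server declaration authoritative; the body is scanned only when the header is silent.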
v1.1.0
Possible by implementing SeimiCrawler's List
SeimiQueue implementations are now loaded on demand
Fixed a problem that occurred when trying to match meta refresh on responses returning file-type data
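Introduction
SeimiCrawler is an agile, independently deployable Java crawler framework with support for distributed operation. It aims to lower, as far as possible, the barrier for newcomers to build a crawler system with high availability and decent performance, and to make crawler development more efficient. In the world of SeimiCrawler, most people only need to write the business logic for crawling; Seimi takes care of the rest. In its design, SeimiCrawler is inspired by Scrapy, the Python crawler framework, while blending in the characteristics of the Java language and the features of Spring. It also hopes to make the more efficient XPath a more convenient and widespread way of parsing HTML in China, so SeimiCrawler's default HTML parser is JsoupXpath (an independent extension project, not bundled with jsoup), and by default all HTML data extraction is done with XPath (you may, of course, choose other parsers for data processing). Combined with SeimiAgent, it is also designed to handle the rendering and crawling of complex dynamic pages.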
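As a rough illustration of XPath-based extraction, the sketch below uses the JDK's built-in javax.xml.xpath rather than JsoupXpath, so it only accepts well-formed XHTML and does not show JsoupXpath's own API. The class name XPathExtractSketch, the select method, and the sample markup are assumptions made for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper built on the JDK XPath engine, not on JsoupXpath.
public class XPathExtractSketch {
    // Evaluates an XPath expression and returns the text of every matched node.
    public static List<String> select(String xhtml, String expr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(expr, doc, XPathConstants.NODESET);
            List<String> out = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                out.add(nodes.item(i).getTextContent());
            }
            return out;
        } catch (Exception e) {
            // Malformed markup or an invalid expression ends up here.
            throw new IllegalArgumentException("XPath extraction failed", e);
        }
    }

    public static void main(String[] args) {
        String page = "<html><body>"
                + "<a class=\"title\" href=\"/p/1\">First</a>"
                + "<a class=\"title\" href=\"/p/2\">Second</a>"
                + "</body></html>";
        System.out.println(select(page, "//a[@class='title']/text()"));
        System.out.println(select(page, "//a/@href"));
    }
}
```

Real-world HTML is rarely well-formed XML, which is one reason a crawler would pair XPath with a tolerant HTML parser instead of the strict XML parser used here.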
Software details: http://seimicrawler.org/
Download: https://github.com/zhegexiaohuozi/SeimiCrawler
Source: OSChina community (开源中国社区)

