Tuesday, April 21, 2009

VGSA & SMB Crawl

It appears that there is an "Undocumented Software Feature" in the current VGSA affecting SMB crawls! For some reason, if you specify just the folder root, you get back no results (the VGSA appears not to do a directory listing). If you specify a pattern match, e.g. *.pdf, it treats the pattern as an absolute path rather than a wildcard. And if you specify files with .htm extensions, you get an error in the Follow and Crawl.

The only way I have been able to get the VGSA to crawl SMB is to specify every file individually, and even then I cannot crawl .htm files in file shares. I suspected that it could be the config on the Windows 2003 File Server, but I have pretty much discounted that. I think the issue is with the VGSA itself, but I need to test against a real GSA to confirm this. That may have to wait a couple of weeks, as I am about out of time with everything else I have to do.
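
Listing every file by hand gets tedious on anything but a tiny share, so here is a minimal sketch of how the per-file list could be generated. It assumes the share is also mounted or mapped locally on the machine running the script, and that the appliance accepts URLs of the form smb://host/share/path. The function name smb_urls and the host and share values are hypothetical, purely for illustration, and I have not verified the output against a VGSA.

```python
import os
from urllib.parse import quote


def smb_urls(local_root, host, share):
    """Yield one smb:// URL per file found under local_root.

    local_root is the share as mounted/mapped locally (an assumption);
    host and share are the names to use in the generated URLs.
    """
    for dirpath, dirnames, filenames in os.walk(local_root):
        dirnames.sort()  # walk subfolders in a stable order
        rel_dir = os.path.relpath(dirpath, local_root)
        for name in sorted(filenames):
            rel = name if rel_dir == "." else os.path.join(rel_dir, name)
            # Percent-encode each path segment and always use forward slashes
            path = "/".join(quote(part) for part in rel.split(os.sep))
            yield "smb://{}/{}/{}".format(host, share, path)
```

Calling list(smb_urls(r"Z:\docs", "fileserver", "docs")) would give one URL per file, which can then be pasted into the crawl configuration. It does nothing about the .htm problem; it only saves typing.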

If I get any further with this, I will post my findings here. If any of you have had a similar issue with the VGSA, please get in touch and let me know.

UPDATE: Actually, this looks like it could be an issue with Windows 2003 Server ... needs further investigation ...
