RE: Question on web search techniques -Reply

From: Kaiser, Rita (rita_kaiser@McKennaCuneo.com)
Date: Mon Aug 03 1998 - 07:21:10 PDT


General search engines cannot find information held in website
databases that sit behind Common Gateway Interface (CGI) scripts, and
that is something I warn my attorneys about. Since the scripts in front
of these databases are often written in a form of C or in Perl and
return results only in response to a query, one must search at the site.
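
A minimal sketch of why a link-following crawler misses this material, assuming a made-up page (the `SAMPLE_PAGE` markup, the `/cgi-bin/search.pl` path, and the helper names are all hypothetical, not taken from any real site). The crawler can follow the plain link, but the database behind the form only answers when someone types a query:

```python
from html.parser import HTMLParser

# Hypothetical page: one static link plus a search form backed by a CGI script.
SAMPLE_PAGE = """
<html><body>
  <a href="/about.html">About</a>
  <form action="/cgi-bin/search.pl" method="get">
    <input type="text" name="q">
    <input type="submit" value="Search">
  </form>
</body></html>
"""


class LinkAndFormCollector(HTMLParser):
    """Collects what a link-following crawler can see on a single page."""

    def __init__(self):
        super().__init__()
        self.links = []  # URLs the crawler can follow directly
        self.forms = []  # form targets that need user-supplied input

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "href" in attrs:
            self.links.append(attrs["href"])
        elif tag == "form" and "action" in attrs:
            self.forms.append(attrs["action"])


def crawlable_and_hidden(html):
    """Return (links a crawler can follow, form targets it cannot fill in)."""
    collector = LinkAndFormCollector()
    collector.feed(html)
    return collector.links, collector.forms


links, forms = crawlable_and_hidden(SAMPLE_PAGE)
print("Crawler can follow:", links)
print("Needs a query typed into a form:", forms)
```

Real crawlers differ in detail, but the general point holds: records that exist only as answers to a form submission have no URL for the crawler to discover.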

If, indeed, the form you were looking for was in PDF, most search engines
can't get to it either. A few search engines residing on individual
websites can search PDF files, but they are awkward to use. One such
site is the TRICARE site at http://www.ochampus.mil. The search engine
there can search the Adobe manuals, but it is one of the few I have seen
that can.

This is a real problem - and how about the one where you try to download
a huge Adobe document and then can't see it? The techs at the State
Dept. told me it was my version of Adobe Reader that was causing the problem.
Instead, it was the integration of Microsoft's Internet Explorer and the
Adobe Reader - I had to put the document into a completely different
directory before I could look at it.

_____________________________
Rita Kaiser
Manager, Library Services
McKenna & Cuneo, LLP
1900 K Street, NW
Washington, DC 20006
202-496-7752
Fax: 202-496-7756
Email: Rita_Kaiser@McKennaCuneo.com
   

> -----Original Message-----
> From: Christopher Carr [SMTP:carr@howdy.com]
> Sent: Monday, August 03, 1998 9:51 AM
> To: wng1@compuserve.com; law-lib@ucdavis.edu
> Subject: Question on web search techniques -Reply
>
> There is no satisfactory method of finding
> things like this. Your method will seem
> comic to future generations, but for today,
> there are few substitutes. The only
> alternative I can suggest is contacting the
> webmaster of a site that you think a likely
> repository for the information you seek.
>
> Karen Mahnk is correct that it is usually
> impractical for third-party search engines
> to index a database stored in a cgi
> directory. Typically, the data is stored in
> a form unreadable to a web crawler-style
> search engine. This problem is somewhat
> mitigated by the fact that many cgi
> directories can be searched through a local
> search routine. If it is a database, as we
> are predicating here, then it must be
> searchable locally. The trick, of course, is
> in first identifying the site as a likely
> location for the information you seek.
>
> However, it does not appear that your
> particular problem relates to cgi. More
> likely, the search engines you used cannot
> index pdf files. I don't know whether any of
> the general-purpose web search engines index
> pdf files, but I'm sure that most of them do
> not. Anyone?
>
> Christopher Carr (speaking for myself)
> Library Services Manager
> Howard, Smith & Levin LLP
> 1330 Avenue of the Americas
> New York NY 10019
> 212 841 1085
>
> >>> Wendy Ng <wng1@compuserve.com> 07/31/98
> 05:32pm >>>
> Although I finally stumbled onto what I
> needed after a while (and I really
> mean "stumbled"), I would appreciate some
> search advice. My question is,
> how do you structure a search/search for
> something that seems to have no
> links to it?
>
> I was looking for a form from the Federal
> Trade Commission, form C4
> Notification and Report Form for Certain
> Mergers and Acquisitions.
> Naturally I searched the FTC website but
> didn't find it. Then tried some
> sites with links to forms and even search
> engines, still no luck. Finally
> looped back to an FTC page. I stared at the
> URL,
> http://www.ftc.gov/bc/docs/<blah blah>.htm,
> wondering maybe if I delete
> that <blah blah>.htm it would lead to a page
> with documents. Sure enough,
> it's an "index of /bc/docs", looked just
> like a typical ftp directory and
> the file formc4.pdf was sitting right there.
>
>
> After I printed the form (which made the
> attorney happy), I tried the FTC
> search engine again with the terms c4,
> formc4.pdf, pdf and still couldn't
> retrieve the form nor could I find a link to
> this index of /bc/docs. I am
> sure there are other sites like this where
> the information is there but you
> can't seem to find it unless you know the
> exact URL.
>
> Any tips? Explanations? Or did I just use
> the incorrect search terms?
>
> Still scratching my head,
>
> Wendy Ng, Librarian
> Spengler Nathanson PLL
> 608 Madison Ave Ste 1000
> Toledo OH 43604-1169
> phone: 419-252-6245
> fax: 419-241-8599
> email: wng1@compuserve.com
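
The URL-trimming trick described in the quoted message (deleting the file name to see whether the server shows an "index of" listing) can be sketched roughly as follows. This is only an illustration: `parent_directories` is a hypothetical helper, `somepage.htm` is a stand-in for the unnamed page in the original message, and whether any of the resulting URLs shows a directory listing depends entirely on how the server is configured.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def parent_directories(url):
    """Return the directory URLs above a page, nearest first."""
    parts = urlsplit(url)
    path = parts.path
    # Drop the file name unless the path already ends in a slash.
    directory = path if path.endswith("/") else posixpath.dirname(path)
    candidates = []
    while True:
        directory = directory.rstrip("/")
        candidates.append(
            urlunsplit((parts.scheme, parts.netloc, directory + "/", "", ""))
        )
        if directory == "":
            break  # reached the site root
        directory = posixpath.dirname(directory)
    return candidates


# "somepage.htm" is a placeholder file name, not the actual FTC page.
for candidate in parent_directories("http://www.ftc.gov/bc/docs/somepage.htm"):
    print(candidate)
```

Each printed URL is a candidate to try by hand in the browser, starting with the directory closest to the page you already have.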


