Poster: stbalbach
Date: December 16, 2010 08:33:16am
Forum: texts
Subject: Re: full text search
Full-text search seems OK. There are two ways:
1. Click on "Read Online", which brings up the book reader, and search there.
2. Click on "Full Text", which is an ASCII text file, and search using the browser's search function.
Poster: dgec3n
Date: December 18, 2010 08:13:40am
Forum: texts
Subject: Re: full text search
Since the change of book reader I have been experiencing exactly the same problem with full text search in every book I have accessed on archive.org. Words found using the old reader are just no longer being found with the new one. Is there any way round this other than using plain text search (or Project Gutenberg where available)? Is there any way that I can go back to using the older but more reliable reader?
Poster: marineK11
Date: December 16, 2010 08:45:29am
Forum: texts
Subject: Re: full text search
Thank you for your answer.
Please have a look at the text I'm working with:
http://www.archive.org/stream/briefederherzog00orlgoog#page/n8/mode/1up
Type in the name 'Magercroon' and you won't get any results, but this name is mentioned on page 115 (document number 123) and page 152 (document number 162).
I used full text search in this document very often and I am quite sure that it worked much better before the site got the new design.
Can this be possible?
Poster: stbalbach
Date: December 16, 2010 10:07:54am
Forum: texts
Subject: Re: full text search
OK, I may see the problem. Here is the book in question:
http://www.archive.org/details/briefederherzog00hollgoog
It is originally from Google, and thus the OCR is poor quality. You can see this by clicking on "Full Text" and doing a basic text search with the browser's search function: the word you're looking for, "Magercroon", is not there because it was not properly OCR'd. The search function in the book reader (i.e. the "Read Online" link) uses that text file as its source for searching, I believe.
If search still doesn't work, even though the word can be found in the text file, you could report a bug to the Book Reader developers:
http://openlibrary.org/dev/docs/bookreader
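The "is the word actually in the text file?" check described above can also be done on a saved copy of the "Full Text" file. The following is only a rough sketch: the function name `word_in_text` and the sample string are made up for illustration, and the misread spelling is a hypothetical example of an OCR error, not taken from the actual book.

```python
import re

def word_in_text(text: str, word: str) -> bool:
    """Return True if `word` occurs as a whole word in `text`, ignoring case."""
    pattern = r"\b" + re.escape(word) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None

# Hypothetical OCR output in which the final "n" was misread as "u".
sample_ocr = "Brief an Herrn Magercroou, den 3. Mai"

print(word_in_text(sample_ocr, "Magercroon"))  # False: the OCR text lacks the word
print(word_in_text(sample_ocr, "Brief"))       # True
```

If the word is missing from the text file, the problem is the OCR rather than the search function; if it is present but the book reader still cannot find it, that would be worth reporting as a bug.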
Poster: rkumar
Date: December 20, 2010 11:25:39am
Forum: texts
Subject: Re: full text search
Thanks for the explanation!
To add a bit more: Full-text search in the bookreader is now backed by a real search engine (we use Solr). Previously, we were using a regex to search through an XML file, which had a lot of problems.
With the new search, you can now search for a phrase (in quotes). Phrase searching works across page boundaries. Hyphenated words now show up in search results.
Also, you can now do full-text search across 2+ million books at once! Give it a shot:
http://openlibrary.org/search/inside
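To illustrate why page boundaries and hyphenation matter for phrase search, here is a small sketch. It is not how Solr or the bookreader works internally; it assumes, purely for illustration, that a naive search looks at each page on its own, and all function names and the sample pages are made up.

```python
import re

def normalize_pages(pages):
    """Join page texts and re-attach words hyphenated across a line or page break."""
    joined = "\n".join(pages)
    # "col-\nlected" -> "collected"
    joined = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", joined)
    # Collapse all runs of whitespace into single spaces.
    return re.sub(r"\s+", " ", joined).strip()

def phrase_in_any_page(pages, phrase):
    """Naive search: look for the phrase inside each page separately."""
    return any(phrase.lower() in " ".join(p.split()).lower() for p in pages)

def phrase_in_book(pages, phrase):
    """Search the whole book as one normalized text."""
    return phrase.lower() in normalize_pages(pages).lower()

# Hypothetical two-page book with a word split across the page break.
pages = [
    "The letters of the Duchess were col-",
    "lected and printed in Stuttgart.",
]

print(phrase_in_any_page(pages, "were collected and"))  # False
print(phrase_in_book(pages, "were collected and"))      # True
```

Note that this simple de-hyphenation would also join genuine hyphenated compounds that happen to fall at a line end, so it is only a rough approximation of what a real search engine has to handle.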
Poster: marineK11
Date: December 16, 2010 12:50:16pm
Forum: texts
Subject: Re: full text search
Thank you very much again!!! Your explanation was a huge help. I understand the problem now and have searched for a version of the book with correct OCR.