[EDM,GED,DMS] roadmap +1 : warmup@2016, 2D indexing comparison with the heavyweight open-source solutions #lucene out-of-my-box@2017

| Category | Files |
|---|---|
| Documents [0.] | 00.txt, 01.odt, 02.doc, 03.pdf, 04.ods, 05.xls, 06.odp, 07.ppt, 08.odt, 09.pdf |
| Images [1.] | 10.jpg, 11.gif, 12.png |
| Audios [2.] | 20.mp3, 21.wav, 22.amr |
| Videos [3.] | 30.avi, 31.mp4, 32.mkv |

`Context` data and cluster ready.
Index [myDoc*.*]... Search [BigData|Big*]...
[Figure: indexing surface by rank. Horizontal axis: input files 00.txt to 32.mkv, grouped as Documents [0.], Images [1.], Audios [2.], Videos [3.]. Vertical axis: rank. Figure labels: "Vertical limit", "Horizontal indexing end".]

| Rank | Score |
|---|---|
| #3 | 09/20 |
| #2 | 10/20 |
| #1 | 12/20 |

Vertical and horizontal: 2 dimensions to improve the indexing surface.

"http://jbd-vm01.jbdata.fr:8983/solr/myCollec-0/select?indent=on&q=Big*&fl=id,a_s,a_i,a_f&sort=a_f asc,a_i asc&rows=100&wt=json"
{
"responseHeader":{
"zkConnected":true,
"status":0,
"QTime":7,
"params":{
"q":"Big*",
"indent":"on",
"fl":"id,a_s,a_i,a_f",
"sort":"a_f asc,a_i asc",
"rows":"100",
"wt":"json"}},
"response":{"numFound":9,"start":0,"docs":[
{
"id":".../dev/ged-06/input-20/myDoc-00.txt"},
{
"id":".../dev/ged-06/input-20/myDoc-01.odt"},
{
"id":".../dev/ged-06/input-20/myDoc-02.doc"},
{
"id":".../dev/ged-06/input-20/myDoc-03.pdf"},
{
"id":".../dev/ged-06/input-20/myDoc-04.ods"},
{
"id":".../dev/ged-06/input-20/myDoc-05.xls"},
{
"id":".../dev/ged-06/input-20/myDoc-06.odp"},
{
"id":".../dev/ged-06/input-20/myDoc-07.ppt"},
{
"id":".../dev/ged-06/input-20/myDoc-10.jpg"}]
}}
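
The select URL above carries spaces and commas in its `sort` and `fl` parameters, which only work unescaped because the URL is quoted. As a minimal sketch, assuming the same host, collection, and field names as in the query above, the URL can be built with standard-library encoding. The helper name `build_solr_url` is hypothetical and not part of Solr.

```python
from urllib.parse import urlencode

# Base URL copied from the query above; host and collection are the author's.
SOLR_SELECT = "http://jbd-vm01.jbdata.fr:8983/solr/myCollec-0/select"


def build_solr_url(query, fields, sort, rows=100):
    """Return a select URL with every parameter percent-encoded (hypothetical helper)."""
    params = {
        "indent": "on",
        "q": query,
        "fl": ",".join(fields),   # fields to return
        "sort": ",".join(sort),   # e.g. "a_f asc,a_i asc"
        "rows": rows,
        "wt": "json",
    }
    return SOLR_SELECT + "?" + urlencode(params)


url = build_solr_url("Big*", ["id", "a_s", "a_i", "a_f"], ["a_f asc", "a_i asc"])
print(url)
```

The resulting string contains no raw spaces, so it can be passed to an HTTP client without shell quoting.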
"http://jbd-vm01.jbdata.fr:9200/mydocs-idx/doc/_search?pretty" -d '{
"query": {
"bool": {
"must": [
{
"match" : { "content" : "BigData" }
}
],
"must_not": [],
"should": []
}
},
"from": 0,
"size": 50,
"sort": [],
"aggs": {}
}'
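
The same search request can be assembled programmatically rather than typed as a shell string. This is a sketch only: the body mirrors the JSON above, the URL is copied from it, and the helper name `build_es_request` and the explicit `Content-Type` header are assumptions added for illustration. The request is built but not sent, so the sketch runs without the cluster.

```python
import json
from urllib.request import Request

# Search URL copied from the request above; index and type names are the author's.
ES_SEARCH = "http://jbd-vm01.jbdata.fr:9200/mydocs-idx/doc/_search?pretty"


def build_es_request(term, size=50):
    """Build (but do not send) the bool/match search request (hypothetical helper)."""
    body = {
        "query": {
            "bool": {
                "must": [{"match": {"content": term}}],
                "must_not": [],
                "should": [],
            }
        },
        "from": 0,
        "size": size,
        "sort": [],
        "aggs": {},
    }
    data = json.dumps(body).encode("utf-8")
    # The Content-Type header is an assumption; the original command does not show one.
    headers = {"Content-Type": "application/json"}
    return Request(ES_SEARCH, data=data, headers=headers, method="POST")


req = build_es_request("BigData")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing `req` to `urllib.request.urlopen` would send it.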
{
"took" : 23,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 10,
".../dev/ged-06/input-20/myDoc-07.ppt"
".../dev/ged-06/input-20/myDoc-01.odt"
".../dev/ged-06/input-20/myDoc-00.txt"
".../dev/ged-06/input-20/myDoc-10.jpg"
".../dev/ged-06/input-20/myDoc-05.xls"
".../dev/ged-06/input-20/myDoc-12.png"
".../dev/ged-06/input-20/myDoc-03.pdf"
".../dev/ged-06/input-20/myDoc-02.doc"
".../dev/ged-06/input-20/myDoc-06.odp"
".../dev/ged-06/input-20/myDoc-04.ods"
java org.apache.lucene.demo.SearchFiles -index .../dev/ged-06/.lucene -query "big*"
Searching for: big*
12 total matching documents
1. .../dev/ged-06/output-20/myDoc-05.txt
2. .../dev/ged-06/output-20/myDoc-03.txt
3. .../dev/ged-06/output-20/myDoc-21.txt
4. .../dev/ged-06/output-20/myDoc-20.txt
5. .../dev/ged-06/output-20/myDoc-00.txt
6. .../dev/ged-06/output-20/myDoc-07.txt
7. .../dev/ged-06/output-20/myDoc-02.txt
8. .../dev/ged-06/output-20/myDoc-04.txt
9. .../dev/ged-06/output-20/myDoc-12.txt
10. .../dev/ged-06/output-20/myDoc-06.txt
Press (n)ext page, (q)uit or enter number to jump to a page.
n
11. .../dev/ged-06/output-20/myDoc-01.txt
12. .../dev/ged-06/output-20/myDoc-10.txt
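
To see where the three result lists differ, the hits can be compared as sets. The sketch below copies the file numbers (the NN in `myDoc-NN.*`) from the Solr, Elasticsearch, and Lucene outputs above and assumes the input set is the 19 files listed at the top of the page. The three queries are not strictly identical (`Big*`, `BigData`, `big*`), so the comparison is indicative only.

```python
# File numbers copied from the three result lists above.
HITS = {
    "Solr": {"00", "01", "02", "03", "04", "05", "06", "07", "10"},
    "Elasticsearch": {"07", "01", "00", "10", "05", "12", "03", "02", "06", "04"},
    "Lucene": {"05", "03", "21", "20", "00", "07", "02", "04", "12", "06", "01", "10"},
}

# Assumption: the full input set is the 19 files named in the table at the top.
ALL_FILES = {
    "00", "01", "02", "03", "04", "05", "06", "07", "08", "09",  # documents
    "10", "11", "12",                                            # images
    "20", "21", "22",                                            # audio
    "30", "31", "32",                                            # videos
}

for engine, found in HITS.items():
    missed = sorted(ALL_FILES - found)
    print(f"{engine}: {len(found)} found, missed {missed}")

# Files gained at each step up the ranking.
only_es = HITS["Elasticsearch"] - HITS["Solr"]
only_lucene = HITS["Lucene"] - HITS["Elasticsearch"]
# Files that none of the three searches returned.
never_found = ALL_FILES - set().union(*HITS.values())

print("Elasticsearch adds over Solr:", sorted(only_es))
print("Lucene adds over Elasticsearch:", sorted(only_lucene))
print("Found by none:", sorted(never_found))
```

On these lists the Elasticsearch hits include every Solr hit, and the Lucene hits include every Elasticsearch hit, so each step up the ranking only adds files.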
Tika 1.15 upgrade and tuning to increase the indexing surface: ( vertical + 2 ) * ( horizontal - 1 ) = 13.
[Figure: indexing surface by rank after the upgrade. Horizontal axis: input files 00.txt to 32.mkv, grouped as Documents [0.], Images [1.], Audios [2.], Videos [3.]. Vertical axis: rank.]

| Rank | Score |
|---|---|
| #3 | 09/20 |
| #2 | 10/20 |
| #1 | 13/20 |