Indexer Partially Aborted

les
Scalix Star
Posts: 819
Joined: Thu Feb 23, 2006 10:18 am
Location: Sydney, Australia

Postby les » Wed Mar 28, 2007 7:50 am

I have a fresh install of Scalix 11.0.2.1 on CentOS 4.4.

I imported all users via sxmboximp.

We have a 22 GB mail store and some 30 or more users.

After all users logged in, the indexer kicked in. I am throttling it via
IDX_MAXLOAD=4
IDX_MINLOAD=1
in general.cfg so that it doesn't kill the box.

The indexer shows as partially aborted after a period of time. The logs indicate that it gets stuck at some point after an hour or so.

I have already tried ulimit -n 10000; that didn't change things.
I have now set it to an excessive value, ulimit -n 50000,
and am waiting to see what happens.

Relevant logs are below. Can anyone shed some light on what's going on and how to fix it?

TIA


/var/opt/scalix/xx/tomcat/logs/scalix-sis-indexer.log shows

2007-03-28 19:45:40,100 INFO [main] [SISConfig.load:32] SIS initialized and ready for indexing
2007-03-28 19:46:01,641 INFO [QueueManager] [BatchUpdater.processMods:219] User 0d100000afb09064-2.62.861.291: added 0 content, added 46 refs, deleted 0 content, deleted 48 refs in 1360 ms
2007-03-28 20:54:09,515 INFO [QueueManager] [BatchUpdater.processMods:219] User 0d100000afb09064-2.62.861.291: added 0 content, added 98 refs, deleted 0 content, deleted 102 refs in 1232 ms
2007-03-28 20:54:12,159 INFO [QueueManager] [BatchUpdater.processMods:219] User 0d100000afb09064-2.62.861.291: added 0 content, added 100 refs, deleted 1 content, deleted 99 refs in 815 ms
2007-03-28 20:54:13,240 INFO [QueueManager] [BatchUpdater.processMods:219] User 0d100000afb09064-2.62.861.291: added 0 content, added 101 refs, deleted 0 content, deleted 101 refs in 1078 ms
2007-03-28 20:54:43,949 INFO [QueueManager] [BatchUpdater.processMods:219] User 0d100000afb09064-2.62.861.291: added 61 content, added 73 refs, deleted 0 content, deleted 20 refs in 4481 ms


/var/opt/scalix/xx/tomcat/logs/scalix-sis-search.log shows

2007-03-28 19:45:40,098 INFO [main] [SISConfig.load:31] SIS initialized and ready for searching
2007-03-28 20:34:56,051 ERROR [TP-Processor10] [SearchEngine.search:51] Index for user 03300000afb09064-2.62.861.291 does not exist
2007-03-28 20:34:56,088 ERROR [TP-Processor10] [WebUtil.errorResponse:63] Error in search engine
2007-03-28 20:35:03,440 ERROR [TP-Processor12] [SearchEngine.search:51] Index for user 03300000afb09064-2.62.861.291 does not exist
2007-03-28 20:35:03,441 ERROR [TP-Processor12] [WebUtil.errorResponse:63] Error in search engine
2007-03-28 20:35:06,359 ERROR [TP-Processor3] [SearchEngine.search:51] Index for user 03300000afb09064-2.62.861.291 does not exist
2007-03-28 20:35:06,360 ERROR [TP-Processor3] [WebUtil.errorResponse:63] Error in search engine
2007-03-28 20:35:09,491 ERROR [TP-Processor6] [SearchEngine.search:51] Index for user 03300000afb09064-2.62.861.291 does not exist
2007-03-28 20:35:09,492 ERROR [TP-Processor6] [WebUtil.errorResponse:63] Error in search engine
2007-03-28 20:35:14,233 ERROR [TP-Processor7] [SearchEngine.search:51] Index for user 03300000afb09064-2.62.861.291 does not exist
2007-03-28 20:35:14,234 ERROR [TP-Processor7] [WebUtil.errorResponse:63] Error in search engine
2007-03-28 20:35:18,530 ERROR [TP-Processor5] [SearchEngine.search:51] Index for user 03300000afb09064-2.62.861.291 does not exist
2007-03-28 20:35:18,531 ERROR [TP-Processor5] [WebUtil.errorResponse:63] Error in search engine
2007-03-28 20:47:17,039 ERROR [TP-Processor2] [SearchEngine.search:51] Index for user 03300000afb09064-2.62.861.291 does not exist
2007-03-28 20:47:17,039 ERROR [TP-Processor2] [WebUtil.errorResponse:63] Error in search engine
2007-03-28 20:47:19,981 ERROR [TP-Processor8] [SearchEngine.search:51] Index for user 03300000afb09064-2.62.861.291 does not exist
2007-03-28 20:47:19,982 ERROR [TP-Processor8] [WebUtil.errorResponse:63] Error in search engine
2007-03-28 20:47:23,520 ERROR [TP-Processor16] [SearchEngine.search:51] Index for user 03300000afb09064-2.62.861.291 does not exist
2007-03-28 20:47:23,520 ERROR [TP-Processor16] [WebUtil.errorResponse:63] Error in search engine
2007-03-28 20:47:25,881 ERROR [TP-Processor5] [SearchEngine.search:51] Index for user 03300000afb09064-2.62.861.291 does not exist
2007-03-28 20:47:25,882 ERROR [TP-Processor5] [WebUtil.errorResponse:63] Error in search engine
2007-03-28 20:47:28,055 ERROR [TP-Processor11] [SearchEngine.search:51] Index for user 03300000afb09064-2.62.861.291 does not exist
2007-03-28 20:47:28,056 ERROR [TP-Processor11] [WebUtil.errorResponse:63] Error in search engine
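
Editor's aside, not part of the original post: a rough Python sketch of how the search log above could be tallied to confirm that every "does not exist" error points at the same user ID. The function name and the sample list are made up for illustration; only the log line format is taken from the excerpt.

```python
import re
from collections import Counter

# Matches the "does not exist" lines shown in scalix-sis-search.log above.
MISSING_INDEX = re.compile(r"Index for user (\S+) does not exist")


def missing_index_counts(lines):
    """Count 'Index for user ... does not exist' errors per user ID."""
    counts = Counter()
    for line in lines:
        match = MISSING_INDEX.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Three lines copied from the excerpt; in practice you would pass an open file.
sample = [
    "2007-03-28 20:34:56,051 ERROR [TP-Processor10] [SearchEngine.search:51] "
    "Index for user 03300000afb09064-2.62.861.291 does not exist",
    "2007-03-28 20:34:56,088 ERROR [TP-Processor10] [WebUtil.errorResponse:63] "
    "Error in search engine",
    "2007-03-28 20:35:03,440 ERROR [TP-Processor12] [SearchEngine.search:51] "
    "Index for user 03300000afb09064-2.62.861.291 does not exist",
]

for user_id, count in missing_index_counts(sample).most_common():
    print(user_id, count)
```

If more than one ID showed up in the tally, that would suggest the problem is not limited to a single mailbox.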


/var/opt/scalix/xx/s/logs/fatal shows
SERIOUS ERROR Indexer (Indexer ) Wed Mar 28 20:06:13 2007
[OM 10270] Process about to terminate due to error.
Signal (Segmentation Violation) trapped by process 3482
Procedure trace follows:
-> handleMatureRequests
<- handleMatureRequests
-> deleteFinishedFiles
<- deleteFinishedFiles
-> readQueue
<- readQueue
-> handleMatureRequests
<- handleMatureRequests
-> deleteFinishedFiles
<- deleteFinishedFiles
-> readQueue
<- readQueue
-> handleMatureRequests
-> sis_connect
<- sis_connect
-> sis_index
Pid of logging process: 3482


SERIOUS ERROR Indexer (Indexer ) Wed Mar 28 20:06:13 2007
[OM 10272] BACKTRACE:
/opt/scalix/lib/libom_er.so(er_add_backtrace+0xc6)[0x853ee6]
/opt/scalix/lib/libom_er.so[0x8541e6]
/opt/scalix/lib/libom_er.so(er_DumpProcAndExit+0x1f)[0x85438f]
/lib/tls/libpthread.so.0[0x6f7898]
/lib/tls/libc.so.6(_IO_fwrite+0x13e)[0x4f6c6e]
/opt/scalix/lib/libom_mdc.so(mdc_fwrite+0x43)[0xd994c3]
/opt/scalix/lib/libom_sis.so(sis__do_post+0x1c5)[0xbb80e5]
/opt/scalix/lib/libom_sis.so(sis_index+0x10d)[0xbb716d]
indexer[0x804b1b6]
/lib/tls/libc.so.6(__libc_start_main+0xd3)[0x4b7de3]
indexer[0x8049721]
Pid of logging process: 3482

Regards,

Les Stott

Postby les » Wed Mar 28, 2007 8:49 am

Changing the ulimit hasn't helped. I have put it back to normal. The Search and Indexing service shows as partially aborted again.

I got some more info out of scalix-sis-indexer.log this time:
2007-03-28 21:44:45,928 INFO [QueueManager] [BatchUpdater.processMods:219] User 0d100000afb09064-2.62.861.291: added 0 content, added 98 refs, deleted 0 content, deleted 102 refs in 10069 ms
2007-03-28 21:44:59,028 WARN [QueueManager] [BatchUpdater.performContentDeletes:409] No documents deleted for DeleteContentInfo{IndexInfo{userID='0d100000afb09064-2.62.861.291'}, indexID='21aa8c0-46090bfa-46094577-e41e'}
2007-03-28 21:45:01,123 INFO [QueueManager] [BatchUpdater.processMods:219] User 0d100000afb09064-2.62.861.291: added 0 content, added 99 refs, deleted 1 content, deleted 100 refs in 15040 ms
2007-03-28 21:45:01,490 WARN [TP-Processor14] [IndexUtil.isIndexIDInContentIndex:211] There are 2 documents matching indexID 21aa8c0-46090bfa-4609458b-e6b8 in content index
2007-03-28 21:45:02,318 INFO [QueueManager] [BatchUpdater.processMods:219] User 0d100000afb09064-2.62.861.291: added 0 content, added 100 refs, deleted 0 content, deleted 100 refs in 1193 ms
2007-03-28 21:45:04,485 WARN [TP-Processor15] [IndexUtil.isIndexIDInContentIndex:211] There are 2 documents matching indexID 21aa8c0-46090bfa-4609458b-e6b8 in content index
2007-03-28 21:45:04,511 WARN [TP-Processor6] [IndexUtil.isIndexIDInContentIndex:211] There are 2 documents matching indexID 21aa8c0-46090bfa-4609458c-e6cd in content index
2007-03-28 21:45:04,514 WARN [TP-Processor12] [IndexUtil.isIndexIDInContentIndex:211] There are 2 documents matching indexID 21aa8c0-46090bfa-4609458d-e6d0 in content index
2007-03-28 21:45:04,616 WARN [TP-Processor12] [IndexUtil.isIndexIDInContentIndex:211] There are 2 documents matching indexID 21aa8c0-46090bfa-46094591-e6f4 in content index
2007-03-28 21:45:04,622 WARN [TP-Processor3] [IndexUtil.isIndexIDInContentIndex:211] There are 2 documents matching indexID 21aa8c0-46090bfa-46094591-e6fa in content index
2007-03-28 21:45:04,628 WARN [TP-Processor14] [IndexUtil.isIndexIDInContentIndex:211] There are 2 documents matching indexID 21aa8c0-46090bfa-46094592-e6fd in content index
2007-03-28 21:45:04,632 WARN [TP-Processor15] [IndexUtil.isIndexIDInContentIndex:211] There are 2 documents matching indexID 21aa8c0-46090bfa-46094592-e700 in content index
2007-03-28 21:45:04,636 WARN [TP-Processor5] [IndexUtil.isIndexIDInContentIndex:211] There are 2 documents matching indexID 21aa8c0-46090bfa-46094592-e703 in content index
2007-03-28 21:45:17,672 INFO [QueueManager] [BatchUpdater.processMods:219] User 0d100000afb09064-2.62.861.291: added 22 content, added 94 refs, deleted 0 content, deleted 73 refs in 1483 ms
2007-03-28 21:45:35,329 INFO [QueueManager] [BatchUpdater.processMods:219] User 0d100000afb09064-2.62.861.291: added 12 content, added 13 refs, deleted 0 content, deleted 0 refs in 1896 ms
2007-03-28 21:51:49,814 INFO [QueueManager] [BatchUpdater.processMods:219] User 0d100000afb09064-2.62.861.291: added 1 content, added 1 refs, deleted 0 content, deleted 0 refs in 936 ms
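
Editor's aside, not part of the original post: a small Python sketch for picking the slow batches out of the indexer log above. The regular expression is written against the BatchUpdater.processMods lines exactly as they appear in this excerpt; the 5000 ms threshold and the function name are arbitrary choices for illustration.

```python
import re

# Pattern for the BatchUpdater.processMods INFO lines quoted above.
PROCESS_MODS = re.compile(
    r"^(?P<ts>\S+ \S+) INFO .*BatchUpdater\.processMods:\d+\] "
    r"User (?P<user>\S+): added \d+ content, added \d+ refs, "
    r"deleted \d+ content, deleted \d+ refs in (?P<ms>\d+) ms"
)


def slow_batches(lines, threshold_ms=5000):
    """Return (timestamp, user ID, duration in ms) for batches at or over the threshold."""
    slow = []
    for line in lines:
        match = PROCESS_MODS.match(line)
        if match and int(match.group("ms")) >= threshold_ms:
            slow.append((match.group("ts"), match.group("user"), int(match.group("ms"))))
    return slow


# Lines copied from the excerpt above (one WARN line is shortened).
sample_log = [
    "2007-03-28 21:44:45,928 INFO [QueueManager] [BatchUpdater.processMods:219] "
    "User 0d100000afb09064-2.62.861.291: added 0 content, added 98 refs, "
    "deleted 0 content, deleted 102 refs in 10069 ms",
    "2007-03-28 21:44:59,028 WARN [QueueManager] "
    "[BatchUpdater.performContentDeletes:409] No documents deleted",
    "2007-03-28 21:45:01,123 INFO [QueueManager] [BatchUpdater.processMods:219] "
    "User 0d100000afb09064-2.62.861.291: added 0 content, added 99 refs, "
    "deleted 1 content, deleted 100 refs in 15040 ms",
    "2007-03-28 21:45:02,318 INFO [QueueManager] [BatchUpdater.processMods:219] "
    "User 0d100000afb09064-2.62.861.291: added 0 content, added 100 refs, "
    "deleted 0 content, deleted 100 refs in 1193 ms",
]

for ts, user, ms in slow_batches(sample_log):
    print(ts, user, ms)
```

This only flags batches that took a long time; it says nothing about why the indexer process itself aborted.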



scalix-sis-search.log has exactly the same errors as before. It looks like it's getting stuck at a particular user.

Index for user 03300000afb09064-2.62.861.291 does not exist
How do I tell which user that maps to?
Regards,

Les Stott

ScalixSupport
Scalix
Posts: 5503
Joined: Thu Mar 25, 2004 8:15 pm

Postby ScalixSupport » Wed Mar 28, 2007 9:08 am

Hi!

See if this helps:

omsearch -s -m @all-attr@ | grep 03300000afb09064

[Edit]
Once you are able to find a user, try running "omscan -Aavfx <username>" to see if the
message store is OK.

Thanks,
Subir

Postby les » Wed Mar 28, 2007 6:15 pm

ScalixSupport wrote:Hi!

See if this helps:

omsearch -s -m @all-attr@ | grep 03300000afb09064



It turns up nothing.......
Regards,

Les Stott

Postby ScalixSupport » Thu Mar 29, 2007 7:26 am

Hi Les!

This looks to me like the GLOBAL-UNIQUE-ID that is assigned to each user. Can you try
the command omsearch -s -m @all-attr@ -v | grep GLOBAL-UNIQUE-ID and see whether it
returns any pattern similar to 03300000afb09064?

Thanks,
Subir

Postby les » Thu Mar 29, 2007 10:27 am

ScalixSupport wrote:Hi Les!

This looks to me like the GLOBAL-UNIQUE-ID that is assigned to each user. Can you try
the command omsearch -s -m @all-attr@ -v | grep GLOBAL-UNIQUE-ID and see whether it
returns any pattern similar to 03300000afb09064?

Thanks,
Subir


Results of the command are below. There is a 13300000afb09064, which is as close as it gets. What next?

[root@server1 ~]# omsearch -s -m @all-attr@ -v | grep GLOBAL-UNIQUE-ID
GLOBAL-UNIQUE-ID=15000000afb09064-2.62.861.291
GLOBAL-UNIQUE-ID=09000000afb09064-2.62.861.291
GLOBAL-UNIQUE-ID=0b000000afb09064-2.62.861.291
GLOBAL-UNIQUE-ID=0d000000afb09064-2.62.861.291
GLOBAL-UNIQUE-ID=0f000000afb09064-2.62.861.291
GLOBAL-UNIQUE-ID=11100000afb09064-2.62.861.291
GLOBAL-UNIQUE-ID=13100000afb09064-2.62.861.291
GLOBAL-UNIQUE-ID=15100000afb09064-2.62.861.291
GLOBAL-UNIQUE-ID=17100000afb09064-2.62.861.291
GLOBAL-UNIQUE-ID=19100000afb09064-2.62.861.291
GLOBAL-UNIQUE-ID=1b100000afb09064-2.62.861.291
GLOBAL-UNIQUE-ID=1d100000afb09064-2.62.861.291
GLOBAL-UNIQUE-ID=1f100000afb09064-2.62.861.291
GLOBAL-UNIQUE-ID=11200000afb09064-2.62.861.291
GLOBAL-UNIQUE-ID=13200000afb09064-2.62.861.291
GLOBAL-UNIQUE-ID=15200000afb09064-2.62.861.291
GLOBAL-UNIQUE-ID=17200000afb09064-2.62.861.291
GLOBAL-UNIQUE-ID=19200000afb09064-2.62.861.291
GLOBAL-UNIQUE-ID=1b200000afb09064-2.62.861.291
GLOBAL-UNIQUE-ID=1d200000afb09064-2.62.861.291
GLOBAL-UNIQUE-ID=1f200000afb09064-2.62.861.291
GLOBAL-UNIQUE-ID=11300000afb09064-2.62.861.291
GLOBAL-UNIQUE-ID=13300000afb09064-2.62.861.291
GLOBAL-UNIQUE-ID=15300000afb09064-2.62.861.291
GLOBAL-UNIQUE-ID=17300000afb09064-2.62.861.291
GLOBAL-UNIQUE-ID=19300000afb09064-2.62.861.291
GLOBAL-UNIQUE-ID=1b300000afb09064-2.62.861.291
GLOBAL-UNIQUE-ID=1d300000afb09064-2.62.861.291
GLOBAL-UNIQUE-ID=1f300000afb09064-2.62.861.291
GLOBAL-UNIQUE-ID=11400000afb09064-2.62.861.291
GLOBAL-UNIQUE-ID=1d500000afb09064-2.62.861.291
GLOBAL-UNIQUE-ID=13400000afb09064-2.62.861.291
GLOBAL-UNIQUE-ID=15400000afb09064-2.62.861.291
GLOBAL-UNIQUE-ID=17400000afb09064-2.62.861.291
GLOBAL-UNIQUE-ID=19400000afb09064-2.62.861.291
GLOBAL-UNIQUE-ID=1b400000afb09064-2.62.861.291
GLOBAL-UNIQUE-ID=1d400000afb09064-2.62.861.291
GLOBAL-UNIQUE-ID=1f400000afb09064-2.62.861.291
GLOBAL-UNIQUE-ID=11500000afb09064-2.62.861.291
GLOBAL-UNIQUE-ID=13500000afb09064-2.62.861.291
GLOBAL-UNIQUE-ID=15500000afb09064-2.62.861.291
GLOBAL-UNIQUE-ID=17500000afb09064-2.62.861.291
GLOBAL-UNIQUE-ID=19500000afb09064-2.62.861.291
GLOBAL-UNIQUE-ID=1b500000afb09064-2.62.861.291
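
Editor's aside, not part of the original post: the "closest match" comparison done by eye above can be sketched in Python by ranking the listed IDs by the number of character positions in which they differ from the ID in the search log. This is only a string comparison. It assumes nothing about how Scalix derives the IDs and does not show that the closest ID belongs to the same user. The function names are made up for illustration.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


def closest_ids(target, candidates, limit=3):
    """Return up to `limit` candidates, fewest differing characters first."""
    same_length = [c for c in candidates if len(c) == len(target)]
    return sorted(same_length, key=lambda c: (hamming(target, c), c))[:limit]


# ID from the search log, and a few IDs copied from the omsearch output above.
target = "03300000afb09064-2.62.861.291"
listed = [
    "0d000000afb09064-2.62.861.291",
    "11300000afb09064-2.62.861.291",
    "13300000afb09064-2.62.861.291",
    "15300000afb09064-2.62.861.291",
]

for candidate in closest_ids(target, listed):
    print(hamming(target, candidate), candidate)
```

In practice the candidate list would be the full grep output with the GLOBAL-UNIQUE-ID= prefix stripped.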
Regards,

Les Stott

Postby les » Thu Mar 29, 2007 11:09 am

les wrote:
ScalixSupport wrote:Hi Les!

This looks to me like the GLOBAL-UNIQUE-ID that is assigned to each user. Can you try
the command omsearch -s -m @all-attr@ -v | grep GLOBAL-UNIQUE-ID and see whether it
returns any pattern similar to 03300000afb09064?

Thanks,
Subir


Results of the command are below. There is a 13300000afb09064, which is as close as it gets. What next?


I ran omscan -A -f -x -U ptroughton,
which matches the closest reference; it came back clean.
Regards,

Les Stott

paultt
Posts: 25
Joined: Tue Oct 17, 2006 10:48 pm
Location: Melbourne, Australia

Postby paultt » Thu Mar 29, 2007 6:23 pm

les wrote:
I ran omscan -A -f -x -U ptroughton,
which matches the closest reference; it came back clean.


I'm working on the same server as Les. As Les said, stopping and restarting Scalix leads to the indexer working for a little while and then getting into a "partially aborted" state, in which there's no indexer process.

Stopping Scalix, clearing the indexwork directory, and restarting Scalix leads to the indexer coming up and staying up.

I've since tried running sxmkindex on a few individual users (including ptroughton), and it works correctly, with the indexer staying alive, and search working in SWA for those users once it has completed.

I suspect that there is a problem causing the indexer to abort when processing the mailbox of one of the users, just not one of the ones I've fed it manually.

I have two questions:

1. When you run sxmkindex without a username to index the whole message store, does it do the users in any particular order? Perhaps it is alphabetical?

2. Is there any way of restarting the indexer from its "partially aborted" state that's less disruptive to users than restarting all of Scalix? omoff/omon do not work, as it's a "non-stop" process.

Many thanks, Paul.

Postby les » Thu Mar 29, 2007 7:40 pm

paultt wrote: 2. Is there any way of restarting the indexer from its "partially aborted" state that's less disruptive to users than restarting all of Scalix? omoff/omon do not work, as it's a "non-stop" process.

Hi Paul, I found that this will do it.
Stop the indexer:
/opt/scalix/bin/omshutdm -s 35 -p -d 0
Start the indexer:
/opt/scalix/bin/indexer

Not sure why you can't do it with omon or omoff -d 0 -s Indexer or similar. Maybe someone can shed some light on that?
Regards,

Les Stott

Postby paultt » Thu Mar 29, 2007 8:08 pm

les wrote: Hi Paul, I found that this will do it.
Stop the indexer:
/opt/scalix/bin/omshutdm -s 35 -p -d 0
Start the indexer:
/opt/scalix/bin/indexer

Not sure why you can't do it with omon or omoff -d 0 -s Indexer or similar. Maybe someone can shed some light on that?


Thank you -- where did you find out about that command? It's not in the admin guide, and doesn't have a man page!

Cheers, Paul.

Postby ScalixSupport » Fri Mar 30, 2007 2:12 am

Hi!

les wrote:
Hi Paul, I found that this will do it.
Stop the indexer:
/opt/scalix/bin/omshutdm -s 35 -p -d 0
Start the indexer:
/opt/scalix/bin/indexer

Not sure why you can't do it with omon or omoff -d 0 -s Indexer or similar. Maybe someone can shed some light on that?

How did you come to know that the service number for the indexer is 35?

Thanks,
Subir

Postby les » Fri Mar 30, 2007 5:11 am

ScalixSupport wrote:Hi!

les wrote:
Hi Paul, I found that this will do it.
Stop the indexer:
/opt/scalix/bin/omshutdm -s 35 -p -d 0
Start the indexer:
/opt/scalix/bin/indexer

Not sure why you can't do it with omon or omoff -d 0 -s Indexer or similar. Maybe someone can shed some light on that?

How did you come to know that the service number for the indexer is 35?

Thanks,
Subir


OK, trade secrets follow ;)

I listed the details for all services with:

omsetsvc -e

(You can also run omsetsvc -r Indexer to get just the Indexer results.)

Then I found the startup and shutdown details for the Indexer service and used those commands.

[root@mail ~]# omsetsvc -r Indexer
Details for subsystem Indexer:
Service Number = 35
Number of components = 2
Logging Level = 7
Audit logging Level = 0
Has an input queue? - NO
Show details from omstat? - YES
Subsystem can be enabled? - NO
Last state change (on/off) =
Last delayed off time =
Startup prog name = ~/bin/indexer
Shutdown program name = ~/bin/omshutdm -s 35 -p -d %d
Status program name =
PID's of subsystem processes: 3435 3436 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Nice Level = 0
Additional Resolve Flag = 0
Subsystem is controlled by 'all' - YES
Minimum temporary processes = 0
Maximum temporary processes = 0
Context dependent information: 0 0 0 0 0 0 0 0
Auxiliary processes = 0
PID's of auxiliary processes: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
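
Editor's aside, not part of the original post: a minimal Python sketch of the lookup described above, pulling the startup and shutdown entries out of saved omsetsvc -r Indexer output and turning them into the two commands used earlier in the thread. Two things here are assumptions inferred from those commands rather than documented behaviour: that "~" stands for /opt/scalix, and that %d is the delay value passed to -d. The function names are hypothetical, and the sketch only builds strings; it does not run anything.

```python
def parse_omsetsvc(text):
    """Collect the 'key = value' lines of omsetsvc -r output into a dict."""
    details = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" = ")
        if sep:  # lines without ' = ' (headers, YES/NO flags, PID lists) are skipped
            details[key.strip()] = value.strip()
    return details


def restart_commands(details, scalix_home="/opt/scalix", delay=0):
    """Build (stop, start) command strings from the parsed service details.

    Assumes '~' means the Scalix install directory and '%d' is the delay.
    """
    def expand(command):
        return scalix_home + command[1:] if command.startswith("~/") else command

    stop = details["Shutdown program name"].replace("%d", str(delay))
    start = details["Startup prog name"]
    return expand(stop), expand(start)


# Trimmed copy of the output shown above.
SAMPLE = """Details for subsystem Indexer:
Service Number = 35
Startup prog name = ~/bin/indexer
Shutdown program name = ~/bin/omshutdm -s 35 -p -d %d
Status program name =
"""

stop_cmd, start_cmd = restart_commands(parse_omsetsvc(SAMPLE))
print(stop_cmd)
print(start_cmd)
```

Reading the commands from the service details rather than hard-coding 35 should keep the approach usable if the service number differs on another install.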

Regards,

Les
Regards,

Les Stott

Postby ScalixSupport » Fri Mar 30, 2007 7:55 am

Thanks a lot! Good information.

Warm Regards,
Subir

