Sunspot/Solr: resuming an interrupted indexing operation

#ruby-on-rails-4 #solr #sunspot-rails

Question:

I'm trying to index ~10 million records, and at some point I got the following error:

 Net::ReadTimeout: Net::ReadTimeout
/home/deploy/webshop/shared/bundle/ruby/2.1.0/gems/rsolr-1.1.1/lib/rsolr/connection.rb:15:in `execute'
/home/deploy/webshop/shared/bundle/ruby/2.1.0/gems/rsolr-1.1.1/lib/rsolr/client.rb:181:in `execute'
/home/deploy/webshop/shared/bundle/ruby/2.1.0/gems/rsolr-1.1.1/lib/rsolr/client.rb:175:in `send_and_receive'
/home/deploy/webshop/shared/bundle/ruby/2.1.0/gems/sunspot_rails-2.2.5/lib/sunspot/rails/solr_instrumentation.rb:16:in `block in send_and_receive_with_as_instrumentation'
/home/deploy/webshop/shared/bundle/ruby/2.1.0/gems/activesupport-4.1.13/lib/active_support/notifications.rb:159:in `block in instrument'
/home/deploy/webshop/shared/bundle/ruby/2.1.0/gems/activesupport-4.1.13/lib/active_support/notifications/instrumenter.rb:20:in `instrument'
/home/deploy/webshop/shared/bundle/ruby/2.1.0/gems/activesupport-4.1.13/lib/active_support/notifications.rb:159:in `instrument'
/home/deploy/webshop/shared/bundle/ruby/2.1.0/gems/sunspot_rails-2.2.5/lib/sunspot/rails/solr_instrumentation.rb:15:in `send_and_receive_with_as_instrumentation'
(eval):2:in `post'
/home/deploy/webshop/shared/bundle/ruby/2.1.0/gems/rsolr-1.1.1/lib/rsolr/client.rb:84:in `update'
/home/deploy/webshop/shared/bundle/ruby/2.1.0/gems/rsolr-1.1.1/lib/rsolr/client.rb:113:in `commit'
/home/deploy/webshop/shared/bundle/ruby/2.1.0/gems/sunspot-2.2.5/lib/sunspot/session.rb:123:in `commit'
/home/deploy/webshop/shared/bundle/ruby/2.1.0/gems/sunspot-2.2.5/lib/sunspot/session_proxy/abstract_session_proxy.rb:11:in `commit'
/home/deploy/webshop/shared/bundle/ruby/2.1.0/gems/sunspot-2.2.5/lib/sunspot.rb:253:in `commit'
/home/deploy/webshop/releases/20161022173421/lib/tasks/sunspot.rake:10:in `block (3 levels) in <top (required)>'
/home/deploy/webshop/shared/bundle/ruby/2.1.0/gems/activerecord-4.1.13/lib/active_record/relation/batches.rb:126:in `find_in_batches'
/home/deploy/webshop/releases/20161022173421/lib/tasks/sunspot.rake:7:in `block (2 levels) in <top (required)>'
IO::EAGAINWaitReadable: Resource temporarily unavailable - read would block
/home/deploy/webshop/shared/bundle/ruby/2.1.0/gems/rsolr-1.1.1/lib/rsolr/connection.rb:15:in `execute'
/home/deploy/webshop/shared/bundle/ruby/2.1.0/gems/rsolr-1.1.1/lib/rsolr/client.rb:181:in `execute'
/home/deploy/webshop/shared/bundle/ruby/2.1.0/gems/rsolr-1.1.1/lib/rsolr/client.rb:175:in `send_and_receive'
/home/deploy/webshop/shared/bundle/ruby/2.1.0/gems/sunspot_rails-2.2.5/lib/sunspot/rails/solr_instrumentation.rb:16:in `block in send_and_receive_with_as_instrumentation'
/home/deploy/webshop/shared/bundle/ruby/2.1.0/gems/activesupport-4.1.13/lib/active_support/notifications.rb:159:in `block in instrument'
/home/deploy/webshop/shared/bundle/ruby/2.1.0/gems/activesupport-4.1.13/lib/active_support/notifications/instrumenter.rb:20:in `instrument'
/home/deploy/webshop/shared/bundle/ruby/2.1.0/gems/activesupport-4.1.13/lib/active_support/notifications.rb:159:in `instrument'
/home/deploy/webshop/shared/bundle/ruby/2.1.0/gems/sunspot_rails-2.2.5/lib/sunspot/rails/solr_instrumentation.rb:15:in `send_and_receive_with_as_instrumentation'
(eval):2:in `post'
/home/deploy/webshop/shared/bundle/ruby/2.1.0/gems/rsolr-1.1.1/lib/rsolr/client.rb:84:in `update'
/home/deploy/webshop/shared/bundle/ruby/2.1.0/gems/rsolr-1.1.1/lib/rsolr/client.rb:113:in `commit'
/home/deploy/webshop/shared/bundle/ruby/2.1.0/gems/sunspot-2.2.5/lib/sunspot/session.rb:123:in `commit'
/home/deploy/webshop/shared/bundle/ruby/2.1.0/gems/sunspot-2.2.5/lib/sunspot/session_proxy/abstract_session_proxy.rb:11:in `commit'
/home/deploy/webshop/shared/bundle/ruby/2.1.0/gems/sunspot-2.2.5/lib/sunspot.rb:253:in `commit'
/home/deploy/webshop/releases/20161022173421/lib/tasks/sunspot.rake:10:in `block (3 levels) in <top (required)>'
/home/deploy/webshop/shared/bundle/ruby/2.1.0/gems/activerecord-4.1.13/lib/active_record/relation/batches.rb:126:in `find_in_batches'
/home/deploy/webshop/releases/20161022173421/lib/tasks/sunspot.rake:7:in `block (2 levels) in <top (required)>'
Tasks: TOP => sunspot:index
  

The reindexing logic iterates over all active items and adds them to the index:

 namespace :sunspot do
  desc "reindex items efficiently"
  task :index => :environment do
    puts "reindexing..."
    records = Item.active.count
    pbar = ProgressBar.new(records)
    Item.active.searchable_includes.find_in_batches do |items|
      items.each { |item| item.solr_index }
      pbar.increment! items.size
      Sunspot.commit
    end
    pbar.finish
  end
end
  

Question: how can I efficiently resume indexing? In other words, how do I skip the first ~7 million records that are already indexed and continue with the remaining ~3 million?
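
To make "resuming" concrete, here is a minimal sketch of a checkpoint-based approach in plain Ruby. It is an illustration under stated assumptions, not a tested fix for the task above: `IndexCheckpoint` and `resume_indexing` are hypothetical names, the array of ids stands in for the `Item.active` relation, and the block stands in for the `solr_index` / `Sunspot.commit` calls. It relies on batches being processed in ascending primary-key order, which is how `find_in_batches` iterates.

```ruby
# Hypothetical helper: remembers the id of the last record whose batch was
# committed, so an interrupted run can continue from that point.
class IndexCheckpoint
  def initialize(path)
    @path = path
  end

  # Last committed id, or 0 when no checkpoint file exists yet.
  def last_id
    File.exist?(@path) ? File.read(@path).to_i : 0
  end

  def save(id)
    File.write(@path, id.to_s)
  end
end

# Processes only ids greater than the checkpoint, in ascending order.
# The checkpoint is advanced only after the block returns without raising,
# so a batch that fails (e.g. with Net::ReadTimeout) is retried on the next run.
#
# In the Rake task the equivalent query might look like (table name assumed):
#   Item.active.searchable_includes
#       .where("items.id > ?", checkpoint.last_id)
#       .find_in_batches { |items| ... }
def resume_indexing(ids, checkpoint, batch_size: 1000)
  remaining = ids.select { |id| id > checkpoint.last_id }.sort
  remaining.each_slice(batch_size) do |batch|
    yield batch                  # index and commit the batch here
    checkpoint.save(batch.last)  # record progress after a successful commit
  end
end
```

With something like this, rerunning the task after a timeout would skip everything up to the stored id instead of starting from the first record. Without a stored checkpoint, the last indexed id would have to be determined some other way, for example from the progress bar position or by querying Solr.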