Distributed Ruby and Rails
      @ihower
   http://ihower.tw
        2010/1
About Me
• a.k.a. ihower
    • http://ihower.tw
    • http://twitter.com/ihower
    • http://github.com/ihower
• Ruby on Rails Developer since 2006
• Ruby Taiwan Community
 • http://ruby.tw
Agenda
• Distributed Ruby
• Distributed Message Queues
• Background-processing in Rails
• Message Queues for Rails
• SOA for Rails
• Distributed Filesystem
• Distributed database
1.Distributed Ruby

• DRb
• Rinda
• Starfish
• MapReduce
• MagLev VM
DRb

• Ruby's RMI (remote method invocation) system


• an object in one Ruby process can invoke
  methods on an object in another Ruby
  process on the same or a different machine
DRb (cont.)

• no defined interface, so faster development time
• tightly couples applications, because there is no
  defined API, only methods on objects
• unreliable in large-scale, heavy-load
  production environments
server example 1
require 'drb'

class HelloWorldServer

      def say_hello
          'Hello, world!'
      end

end

DRb.start_service("druby://127.0.0.1:61676",
HelloWorldServer.new)
DRb.thread.join
client example 1
require 'drb'

server = DRbObject.new_with_uri("druby://127.0.0.1:61676")

puts server.say_hello
puts server.inspect

# Hello, world!
# <DRb::DRbObject:0x1003c04c8 @ref=nil, @uri="druby://127.0.0.1:61676">
example 2
# user.rb
class User

  attr_accessor :username

end
server example 2
require 'drb'
require_relative 'user' # user.rb sits next to this file

class UserServer

  attr_accessor :users

  def find(id)
    self.users[id-1]
  end

end

user_server = UserServer.new
user_server.users = []
5.times do |i|
  user = User.new
  user.username = i + 1
  user_server.users << user
end

DRb.start_service("druby://127.0.0.1:61676", user_server)
DRb.thread.join
client example 2
require 'drb'

user_server = DRbObject.new_with_uri("druby://127.0.0.1:61676")

user = user_server.find(2)

puts user.inspect
puts "Username: #{user.username}"
user.username = "ihower"
puts "Username: #{user.username}"
Err...

# <DRb::DRbUnknown:0x1003b8318 @name="User", @buf="\004\bo:\tUser\006:\016@usernamei\a">
# client2.rb:8: undefined method `username' for #<DRb::DRbUnknown:0x1003b8318> (NoMethodError)
Why? DRbUndumped
•   Default DRb operation

    • Pass by value
    • Must share code
• With DRbUndumped
 • Pass by reference
 • No need to share code
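The difference between the two modes can be sketched without a network. DRb marshals arguments and return values by default, so the sketch below uses Marshal directly to show the copy semantics, and then shows that an object including DRbUndumped refuses to be dumped, which is what makes DRb send a reference instead. ValueUser and ReferenceUser are hypothetical classes for illustration only.

```ruby
require 'drb'

# Hypothetical classes, not part of the examples above.
class ValueUser
  attr_accessor :username
end

class ReferenceUser
  include DRb::DRbUndumped
  attr_accessor :username
end

original = ValueUser.new
original.username = "2"

# Pass by value: the receiving side works on a marshalled copy,
# and it needs the class definition to rebuild that copy.
copy = Marshal.load(Marshal.dump(original))
copy.username = "ihower"
puts original.username # the original object is untouched

# Pass by reference: DRbUndumped makes the object refuse to be dumped.
undumpable =
  begin
    Marshal.dump(ReferenceUser.new)
    false
  rescue TypeError
    true
  end
puts undumpable
```

Because the copy lives on the client, changes made there never reach the server unless the object is passed by reference.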
Example 2 Fixed
# user.rb
class User

  include DRbUndumped

  attr_accessor :username

end

# <DRb::DRbObject:0x1003b84f8 @ref=2149433940, @uri="druby://127.0.0.1:61676">
# Username: 2
# Username: ihower
Why use DRbUndumped?

 • Big objects
 • Singleton objects
 • Lightweight clients
 • Rapidly changing software
ID conversion
• Converts reference into DRb object on server
 • DRbIdConv (Default)
 • TimerIdConv
 • NamedIdConv
 • GWIdConv
Beware of garbage collection
•   referenced objects may be collected on the
    server (usually doesn't matter)
•   Build your own ID converter if you want
    to control persistent state.
DRb security
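require 'drb'

ro = DRbObject.new_with_uri("druby://127.0.0.1:61676")
class << ro
    # remove the local method so the call is forwarded to the server
    undef :instance_eval
end

# !!!!!!!! WARNING !!!!!!!!! DO NOT RUN
ro.instance_eval("`rm -rf *`")

# Mitigation on the server side, before DRb.start_service:
#   $SAFE=1
# (note: $SAFE no longer has any effect in Ruby 3.0 and later)
#
# The remote call above then fails with:
# instance_eval':
# Insecure operation - instance_eval (SecurityError)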
DRb security (cont.)

• Access Control Lists (ACLs)
 • via IP address array
 • denial-of-service attacks are still possible
• DRb over SSL
Rinda

• Rinda is a Ruby port of the Linda distributed
    computing paradigm.
•   Linda is a model of coordination and communication among several parallel processes
    operating upon objects stored in and retrieved from shared, virtual, associative memory. This
    model is implemented as a "coordination language" in which several primitives operating on
    ordered sequence of typed data objects, "tuples," are added to a sequential language, such
    as C, and a logically global associative memory, called a tuplespace, in which processes
    store and retrieve tuples. (Wikipedia)
Rinda (cont.)

• Rinda consists of:
 • a TupleSpace implementation
 • a RingServer that allows DRb services to
    automatically discover each other.
RingServer

• Hardcoding IP addresses in a DRb program
  couples applications tightly and makes
  fault tolerance difficult.
• With RingServer, services can detect and interact with
  other services on the network without
  knowing their IP addresses.
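RingServer discovery (the RingServer is reached via a UDP broadcast address)

1. Client @ 192.168.1.100 → RingServer: Where is Service X?
2. RingServer → Client: Service X: 192.168.1.12
3. Client → Service X @ 192.168.1.12: Hi, Service X @ 192.168.1.12
4. Service X → Client: Hi there, 192.168.1.100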
ring server example
require 'rinda/ring'
require 'rinda/tuplespace'

DRb.start_service
Rinda::RingServer.new(Rinda::TupleSpace.new)
DRb.thread.join
service example
require 'rinda/ring'

class HelloWorldServer
    include DRbUndumped # Needed for RingServer

      def say_hello
          'Hello, world!'
      end

end

DRb.start_service
ring_server = Rinda::RingFinger.primary
ring_server.write([:hello_world_service, :HelloWorldServer, HelloWorldServer.new,
                   'I like to say hi!'], Rinda::SimpleRenewer.new)

DRb.thread.join
client example
require 'rinda/ring'

DRb.start_service
ring_server = Rinda::RingFinger.primary

service = ring_server.read([:hello_world_service, nil,nil,nil])
server = service[2]

puts server.say_hello
puts service.inspect

# Hello, world!
# [:hello_world_service, :HelloWorldServer, #<DRb::DRbObject:0x10039b650
#  @uri="druby://fe80::21b:63ff:fec9:335f%en1:57416", @ref=2149388540>, "I like to say hi!"]
TupleSpaces

• Shared object space
• Atomic access
• Just like a bulletin board
• Tuple template is
  [:name, :Class, object, 'description']
5 Basic Operations

• write
• read
• take (Atomic Read+Delete)
• read_all
• notify (Callback for write/take/delete)
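The five operations can be tried against a local Rinda::TupleSpace without a RingServer. This is a minimal sketch; the :job tuples and their contents are made up for illustration, and nil in a template acts as a wildcard.

```ruby
require 'rinda/tuplespace'

ts = Rinda::TupleSpace.new

# notify: subscribe to future 'write' events matching the template
observer = ts.notify('write', [:job, nil, nil])

# write: post two tuples to the space
ts.write([:job, 1, 'resize photo'])
ts.write([:job, 2, 'encode video'])

# read: non-destructive lookup of one matching tuple
peeked = ts.read([:job, nil, nil])

# read_all: every matching tuple, still non-destructive
all_jobs = ts.read_all([:job, nil, nil])

# take: atomic read + delete
taken = ts.take([:job, 1, nil])
remaining = ts.read_all([:job, nil, nil])

# the observer queued one [event, tuple] pair per write
event, tuple = observer.pop

puts all_jobs.size
puts taken.inspect
puts remaining.inspect
puts event
```

In a distributed setup the same calls go through a DRb reference to the TupleSpace, for example the one returned by Rinda::RingFinger.primary in the examples above.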
Starfish

• Starfish is a utility to make distributed
  programming ridiculously easy
• It runs both the server and the client in
  infinite loops
• MapReduce with ActiveRecord or Files
starfish foo.rb
# foo.rb

class Foo
  attr_reader :i

  def initialize
    @i = 0
  end

  def inc
    logger.info "YAY it incremented by 1 up to #{@i}"
    @i += 1
  end
end

server :log => "foo.log" do |object|
  object = Foo.new
end

client do |object|
  object.inc
end
starfish server example
   ARGV.unshift('server.rb')

   require 'rubygems'
   require 'starfish'

   class HelloWorld
     def say_hi
       'Hi There'
     end
   end

   Starfish.server = lambda do |object|
       object = HelloWorld.new
   end

   Starfish.new('hello_world').server
starfish client example
   ARGV.unshift('client.rb')

   require 'rubygems'
   require 'starfish'

   Starfish.client = lambda do |object|
     puts object.say_hi
     exit(0) # exit program immediately
   end

   Starfish.new('hello_world').client
starfish client example (another way)

       ARGV.unshift('server.rb')

       require 'rubygems'
       require 'starfish'

       catch(:halt) do
         Starfish.client = lambda do |object|
           puts object.say_hi
           throw :halt
         end

         Starfish.new('hello_world').client
       end

       puts "bye bye"
MapReduce

• introduced by Google to support
  distributed computing on large data sets on
  clusters of computers.
• inspired by map and reduce functions
  commonly used in functional programming.
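The functional roots are easy to see in plain Ruby, on a single machine. The sketch below counts HTTP status codes with map and reduce; the log lines are invented for illustration, and a real MapReduce system would spread the map step across many workers.

```ruby
# Hypothetical access-log lines, for illustration only.
log_lines = [
  '127.0.0.1 - - "GET / HTTP/1.1" 200',
  '127.0.0.1 - - "GET /missing HTTP/1.1" 404',
  '127.0.0.1 - - "GET /about HTTP/1.1" 200'
]

# Map: turn each line into a [status, 1] pair.
pairs = log_lines.map { |line| [line.split.last, 1] }

# Reduce: fold the pairs into one total per status code.
counts = pairs.reduce(Hash.new(0)) do |totals, (status, n)|
  totals[status] += n
  totals
end

puts counts.inspect
```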
starfish server example
ARGV.unshift('server.rb')

require 'rubygems'
require 'starfish'

Starfish.server = lambda{ |map_reduce|
  map_reduce.type = File
  map_reduce.input = "/var/log/apache2/access.log"
  map_reduce.queue_size = 10
  map_reduce.lines_per_client = 5
  map_reduce.rescan_when_complete = false
}

Starfish.new('log_server').server
starfish client example
   ARGV.unshift('client.rb')

   require 'rubygems'
   require 'starfish'

   Starfish.client = lambda { |logs|
     logs.each do |log|
       puts "Processing #{log}"
       sleep(1)
     end
   }

   Starfish.new("log_server").client
Other implementations
• Skynet
 • Use TupleSpace or MySQL as message queue
 • Include an extension for ActiveRecord
 • http://skynet.rubyforge.org/
• MRToolkit based on Hadoop
 • http://code.google.com/p/mrtoolkit/
MagLev VM

• a fast, stable, Ruby implementation with
  integrated object persistence and
  distributed shared cache.
• http://maglev.gemstone.com/
• public Alpha currently
2.Distributed Message Queues

• Starling
• AMQP/RabbitMQ
• Stomp/ActiveMQ
• beanstalkd
What's a message queue?

  Client --(Message X)--> Queue <--(check and process)-- Processor
Why not DRb?

• DRb has security risks and a poorly designed API
• a distributed message queue is a great way to do
  distributed programming: reliable and scalable.
Starling
• a light-weight persistent queue server that
  speaks the Memcache protocol (mimics its
  API)
• Fast, effective, quick setup and ease of use
• Powered by EventMachine
  http://eventmachine.rubyforge.org/EventMachine.html
• Twitter's open source project; Twitter used it
  before 2009 and has since switched to Kestrel,
  a port of Starling from Ruby to Scala
Starling command

• sudo gem install starling-starling
 • http://github.com/starling/starling
• sudo starling -h 192.168.1.100
• sudo starling_top -h 192.168.1.100
Starling set example
require 'rubygems'
require 'starling'

starling = Starling.new('192.168.1.4:22122')

100.times do |i|
  starling.set('my_queue', i)
end

                     set appends to the queue; it does not
                     overwrite the value as in Memcached
Starling get example
require 'rubygems'
require 'starling'

starling = Starling.new('192.168.2.4:22122')

loop do
  puts starling.get("my_queue")
end
get method
• FIFO
• After get, the object is no longer in the
  queue. You will lose the message if a processing
  error occurs.
• The get method blocks until something is
  returned; internally it is an infinite loop.
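These semantics can be imitated in memory to see why a failed message has to be put back. The sketch below uses Ruby's built-in Queue as a stand-in, not Starling itself, and the failure on message 1 is simulated.

```ruby
# In-memory stand-in for a Starling queue (Ruby's Queue, not Starling).
queue = Queue.new
3.times { |i| queue.push(i) } # comparable to starling.set('my_queue', i)

processed = []
failed_once = false

until queue.empty?
  message = queue.pop # comparable to starling.get: the message leaves the queue
  begin
    # Simulated failure: the first attempt at message 1 raises.
    if message == 1 && !failed_once
      failed_once = true
      raise "processing failed"
    end
    processed << message
  rescue RuntimeError
    queue.push(message) # requeue, otherwise the message would be lost
  end
end

puts processed.inspect
```

The requeued message goes to the back of the queue, so strict FIFO order is not preserved across retries.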
Handle processing error exception
 require 'rubygems'
 require 'starling'

 starling = Starling.new('192.168.2.4:22122')
 results = starling.get("my_queue")

 begin
     puts results.flatten
 rescue NoMethodError => e
     # results does not respond to flatten: log it and requeue it wrapped in an array
     puts e.message
     starling.set("my_queue", [results])
 rescue Exception => e
     # any other failure: put the message back on the queue, then re-raise
     starling.set("my_queue", results)
     raise e
 end
Starling cons

• Clients must poll the queue constantly
• RabbitMQ, by contrast, lets you subscribe to a queue
  and notifies you when a message is available for
  processing.
AMQP/RabbitMQ
• a complete and highly reliable enterprise
  messaging system based on the emerging
  AMQP standard.
  • Erlang
• http://github.com/tmm1/amqp
 • Powered by EventMachine
Stomp/ActiveMQ

• Apache ActiveMQ is the most popular and
  powerful open source messaging and
  Integration Patterns provider.
• sudo gem install stomp
• ActiveMessaging plugin for Rails
beanstalkd
• Beanstalk is a simple, fast workqueue
  service. Its interface is generic, but was
  originally designed for reducing the latency
  of page views in high-volume web
  applications by running time-consuming tasks
  asynchronously.
• http://kr.github.com/beanstalkd/
• http://beanstalk.rubyforge.org/
• Open source; originally built for the Causes application on Facebook
Why do we need asynchronous/background processing in Rails?

• cron-like processing (compute daily statistics data, create reports,
  full-text search index updates, etc.)
• long-running tasks (sending mail, resizing photos, encoding videos,
  generating PDFs, uploading images to S3, posting something to Twitter, etc.)
 • Server traffic jam: an expensive request will block
     server resources (i.e. your Rails app)
 • Bad user experience: users may try to reload
     again and again! (responsiveness matters)
3.Background-processing for Rails
• script/runner
• rake
• cron
• daemon
• run_later plugin
• spawn plugin
script/runner


• In your Rails app root:
• script/runner "Worker.process"
rake

• In RAILS_ROOT/lib/tasks/dev.rake
• rake dev:process
  namespace :dev do
    task :process do
          #...
    end
  end
cron

• Cron is a time-based job scheduler in Unix-
  like computer operating systems.
• crontab -e
Whenever
          http://github.com/javan/whenever

•   A Ruby DSL for Defining Cron Jobs

• http://asciicasts.com/episodes/164-cron-in-ruby
• or http://cronedit.rubyforge.org/
          every 3.hours do
            runner "MyModel.some_process"
            rake "my:rake:task"
            command "/usr/bin/my_great_command"
          end
Daemon

• http://daemons.rubyforge.org/
• http://github.com/dougal/daemon_generator/
rufus-scheduler
   http://github.com/jmettraux/rufus-scheduler


• scheduling pieces of code (jobs)
• Not a replacement for cron/at, since it runs
  inside of Ruby.
           require 'rubygems'
           require 'rufus/scheduler'

           # rufus-scheduler 2.x API; 3.x uses Rufus::Scheduler.new
           scheduler = Rufus::Scheduler.start_new

           scheduler.every '5s' do
               puts 'check blood pressure'
           end

           scheduler.join
Daemon Kit
   http://github.com/kennethkalmer/daemon-kit



• Creating Ruby daemons by providing a
  sound application skeleton (through a
  generator), task specific generators (jabber
  bot, etc) and robust environment
  management code.
Monitor your daemon

• http://mmonit.com/monit/
• http://github.com/arya/bluepill
• http://god.rubyforge.org/
daemon_controller
http://github.com/FooBarWidget/daemon_controller




• A library for robust daemon management
• Make daemon-dependent applications Just
  Work without having to start the daemons
  manually.
off-load task via system command
# mailings_controller.rb
def deliver
  call_rake :send_mailing, :mailing_id => params[:id].to_i
  flash[:notice] = "Delivering mailing"
  redirect_to mailings_url
end

# controllers/application.rb
def call_rake(task, options = {})
  options[:rails_env] ||= Rails.env
  args = options.map { |n, v| "#{n.to_s.upcase}='#{v}'" }
  # redirect stdout first, then stderr, so both end up in rake.log
  system "/usr/bin/rake #{task} #{args.join(' ')} --trace >> #{Rails.root}/log/rake.log 2>&1 &"
end

# lib/tasks/mailer.rake
desc "Send mailing"
task :send_mailing => :environment do
  mailing = Mailing.find(ENV["MAILING_ID"])
  mailing.deliver
end

# models/mailing.rb
def deliver
  sleep 10 # placeholder for sending email
  update_attribute(:delivered_at, Time.now)
end
Simple Thread

after_filter do
    Thread.new do
        AccountMailer.deliver_signup(@user)
    end
end
run_later plugin
      http://github.com/mattmatt/run_later


• Borrowed from Merb
• Uses worker thread and a queue
• Simple solution for simple tasks
  run_later do
      AccountMailer.deliver_signup(@user)
  end
spawn plugin
http://github.com/tra/spawn


  spawn do
    logger.info("I feel sleepy...")
    sleep 11
    logger.info("Time to wake up!")
  end
spawn (cont.)
• By default, spawn uses fork to spawn
  child processes. You can configure it to use
  threading instead.
• Works by creating new database
  connections in ActiveRecord::Base for the
  spawned block.
• Forking has to copy the Rails process every time
threading vs. forking
•   Forking advantages:
    •   more reliable? - the ActiveRecord code is not thread-safe.
    •   keep running - subprocess can live longer than its parent.
    •   easier - just works with Rails default settings. Threading
        requires you to set allow_concurrency=true. Also,
        beware of automatic reloading of classes in development
        mode (config.cache_classes = false).
•   Threading advantages:
    •   less filling - threads take less resources... how much less?
        it depends.
    •   debugging - you can set breakpoints in your threads
Okay, we need a reliable messaging system:
•   Persistent
•   Scheduling: not necessarily all at the same time
•   Scalability: just throw in more instances of your
    program to speed up processing
•   Loosely coupled components that merely ‘talk’
    to each other
•   Ability to easily replace Ruby with something
    else for specific tasks
•   Easy to debug and monitor
4.Message Queues (for Rails only)
• ar_mailer
• BackgrounDRb
• workling
• delayed_job
• resque
Rails only?

• Easy to use/write code
• Jobs are Ruby classes or objects
• But need to load Rails environment
ar_mailer
       http://seattlerb.rubyforge.org/ar_mailer/



• a two-phase delivery agent for ActionMailer.
 • Store messages into the database
 • Delivered later by a separate process, ar_sendmail
BackgrounDRb
            http://backgroundrb.rubyforge.org/

• BackgrounDRb is a Ruby job server and
  scheduler.
• Has scalability problems (~20 servers for Mark Bates)
• Hard to know whether a processing error occurred
• Uses the database to persist tasks
• Uses memcached to report processing results
workling
     http://github.com/purzelrakete/workling




• Gives your Rails app a simple API that you
  can use to make code run in the
  background, outside of your request.
• Supports Starling (default), BackgroundJob,
  Spawn and AMQP/RabbitMQ runners.
Workling/Starling setup
• script/plugin install git://github.com/purzelrakete/workling.git
• sudo starling -p 15151
• RAILS_ENV=production script/workling_client start
Workling example
 class EmailWorker < Workling::Base
   def deliver(options)
     user = User.find(options[:id])
     user.deliver_activation_email
   end
 end


 # in your controller
 def create
     EmailWorker.asynch_deliver( :id => 1)
 end
delayed_job
• Database backed asynchronous priority
  queue
• Extracted from Shopify
• you can place any Ruby object on its queue
  as arguments
• Loads the Rails environment only once
delayed_job setup (use fork version)

• script/plugin install git://github.com/collectiveidea/delayed_job.git
• script/generate delayed_job
• rake db:migrate
delayed_job example (send_later)
def deliver
  mailing = Mailing.find(params[:id])
  mailing.send_later(:deliver)
  flash[:notice] = "Mailing is being delivered."
  redirect_to mailings_url
end
delayed_job example (custom workers)
class MailingJob < Struct.new(:mailing_id)

  def perform
    mailing = Mailing.find(mailing_id)
    mailing.deliver
  end

end

# in your controller
def deliver
  Delayed::Job.enqueue(MailingJob.new(params[:id]))
  flash[:notice] = "Mailing is being delivered."
  redirect_to mailings_url
end
delayed_job example (always asynchronously)


   class Device
     def deliver
       # long running method
     end
     handle_asynchronously :deliver
   end

   device = Device.new
   device.deliver
Running jobs

• rake jobs:work
  (Don't use this in production; it will exit if the database has any network
  connectivity problems.)


• RAILS_ENV=production script/delayed_job start
• RAILS_ENV=production script/delayed_job stop
Priority
                  just an Integer, default is 0

• you can run multiple workers to handle different
  priority jobs
• RAILS_ENV=production script/delayed_job --min-priority 3 start

  Delayed::Job.enqueue(MailingJob.new(params[:id]), 3)

  Delayed::Job.enqueue(MailingJob.new(params[:id]), -3)
Scheduled
        no guarantee of a precise time; the job just runs some time after run_at



Delayed::Job.enqueue(MailingJob.new(params[:id]), 3, 3.days.from_now)

Delayed::Job.enqueue(MailingJob.new(params[:id]),
                                    3, 1.month.from_now.beginning_of_month)
Configuring Delayed Job
# config/initializers/delayed_job_config.rb
Delayed::Worker.destroy_failed_jobs = false
Delayed::Worker.sleep_delay = 5 # seconds to sleep when the queue is empty
Delayed::Worker.max_attempts = 25
Delayed::Worker.max_run_time = 4.hours # set to the time the longest task will take
Automatic retry on failure
 • If a method throws an exception it will be
   caught and the method rerun later.
 • The method will be retried up to 25
   (default) times at increasingly longer
   intervals until it passes.
   • the interval grows to about 108 hours at most (25 ** 4 seconds)
     Job.db_time_now + (job.attempts ** 4) + 5
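The retry formula can be worked through in a few lines. This is a sketch of the arithmetic only, not delayed_job's own code: retry_delay is a hypothetical helper, and it assumes the attempt counter runs from 1 to the default maximum of 25. Plugging 25 into the formula gives roughly 108.5 hours, which appears to be where the 108 hours figure comes from.

```ruby
MAX_ATTEMPTS = 25 # default number of attempts

# Seconds added to the current time for the next run (hypothetical helper).
def retry_delay(attempts)
  (attempts ** 4) + 5
end

delays = (1..MAX_ATTEMPTS).map { |n| retry_delay(n) }
longest_hours = delays.last / 3600.0

puts delays.first(3).inspect
puts longest_hours.round(1)
```

The first retries come within seconds, while the last ones are days apart, so a job that keeps failing stays in the table for weeks before it is given up on.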
Capistrano Recipes
• Remember to restart delayed_job after
  deployment
• Check out lib/delayed_job/recipes.rb
   after "deploy:stop",    "delayed_job:stop"
   after "deploy:start",   "delayed_job:start"
   after "deploy:restart", "delayed_job:restart"
Resque
             http://github.com/defunkt/resque

•   a Redis-backed library for creating background jobs,
    placing those jobs on multiple queues, and processing
    them later.
•   GitHub's open source project
•   you can only enqueue JSON-serializable Ruby objects
•   includes a Sinatra app for monitoring what's going on
•   supports multiple queues
•   a good fit when you expect a lot of failure/chaos
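The JSON restriction has a practical consequence that a round trip through the json library makes visible. The sketch below needs neither Resque nor Redis; the payload shape (a class name plus an args array) is an assumption modelled on how such job payloads are commonly stored, and MailingJob is only a name here.

```ruby
require 'json'

# Assumed payload shape, for illustration only.
payload = { 'class' => 'MailingJob', 'args' => [42, { :locale => :en }] }

# What the worker sees after the payload has been stored as JSON.
restored = JSON.parse(JSON.generate(payload))

puts restored['args'].inspect
```

Symbols come back as strings and arbitrary Ruby objects do not survive at all, so jobs are usually given record ids and plain values rather than model instances.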
My recommendations:

• General purpose: delayed_job
  (GitHub highly recommends DelayedJob to anyone whose site is not 50% background work.)



• Time-scheduled: cron + rake
5. SOA for Rails

• What’s SOA
• Why SOA
• Considerations
• The tool set
What’s SOA
           Service oriented architectures



• “monolithic” approach is not enough
• SOA is a way to design complex applications
  by splitting out major components into
  individual services and communicating via
  APIs.
• a service is a vertical slice of functionality:
  database, application code and caching layer
a monolithic web app example
  request → Load Balancer → WebApps → Database
a SOA example
  Front ends: request → WebApp for Administration
              request → Load Balancer → WebApps for User
  Services:   Services A, Services B
  Storage:    one Database under each service
Why SOA? Isolation
• Shared Resources
• Encapsulation
• Scalability
• Interoperability
• Reuse
• Testability
• Reduce Local Complexity
Shared Resources
• Different front-end websites use the same
  resources.
• SOA helps you avoid duplicating databases
  and code.
• Why not just share the database?
  (WebApp for Administration and WebApps for User on one Database)
 • code is not DRY
 • caching will be problematic
Encapsulation

• you can change the underlying implementation of a
  service without affecting other parts of the system
 • upgrade library
 • upgrade to Ruby 1.9
• you can provide API versioning
Scalability 1: Partitioned Data Provides
•   The database is the first bottleneck; a single DB
    server cannot scale. SOA helps you reduce
    database load.
•   Anti-pattern: only splitting the database
    (WebApps talking directly to Database A and Database B)
    •   model relationships are broken
    •   referential integrity is lost
•   Myth: database replication solves this. It cannot
    give you both speed and consistency.
Scalability 2: Caching

• SOA makes a caching system easier to design
 • Cache data at the right times and expire it
    at the right times
 • Cache the logical model, not the physical one
 • You do not need to cache views everywhere
Scalability 3: Efficient
• Different components carry different loads;
  SOA lets you scale each service separately.

  WebApps → Load Balancer → Services A (2 instances)
  WebApps → Load Balancer → Services B (4 instances)
Security

• Different services can sit behind different
  firewalls
  • You can expose only the public web apps and
    services; everything else stays inside the firewall.
Interoperability
• HTTP is the common interface; SOA helps
  you integrate:
 • Multiple languages
 • Internal systems, e.g. a full-text search engine
 • Legacy databases and systems
 • External vendors
Reuse

• Reuse across multiple applications
• Reuse for public APIs
• Example: Amazon Web Services (AWS)
Testability

• Isolate problem
• Mocking API calls
 • Reduce the time to run test suite
Reduce Local Complexity
• Team modularity along the same module
  splits as your software
• Understandability: The amount of code is
  minimized to a quantity understandable by
  a small team
• Source code control
Considerations

• Partition into Separate Services
• API Design
• Which Protocol
How to partition into Separate Services
• Partitioning on Logical Function
• Partitioning on Read/Write Frequencies
• Partitioning by Minimizing Joins
• Partitioning by Iteration Speed
API Design

• Send Everything you need
• Parallel HTTP requests
• Send as Little as Possible
• Use Logical Models
Physical Models & Logical Models
• Physical models are mapped to database
  tables through ORM. (It’s 3NF)
• Logical models are mapped to your
  business problem. (External API use it)
• Logical models are mapped to physical
  models by you.
Logical Models
• Not relational or normalized
• Maintainability
  • can change with no change to data store
  • can stay the same while the data store
    changes
• Better fit for REST interfaces
• Better caching
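The split between physical and logical models can be sketched with plain Ruby objects. Everything here is hypothetical: UserRow and AddressRow stand in for two normalized tables, and customer_profile is the hand-written mapping to the logical model that the service would expose.

```ruby
require 'json'

# Hypothetical physical models, one per normalized table.
UserRow    = Struct.new(:id, :first_name, :last_name)
AddressRow = Struct.new(:user_id, :city, :country_code)

# Logical model: shaped for the API consumer, not for the tables.
def customer_profile(user, address)
  {
    'id'       => user.id,
    'name'     => "#{user.first_name} #{user.last_name}",
    'location' => "#{address.city}, #{address.country_code}"
  }
end

user    = UserRow.new(1, 'Ada', 'Lovelace')
address = AddressRow.new(1, 'Taipei', 'TW')

profile_json = JSON.generate(customer_profile(user, address))
puts profile_json
```

If the tables are later split or renamed, only customer_profile changes; the JSON that clients and caches see stays the same.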
Which Protocol?

• SOAP
• XML-RPC
• REST
RESTful Web services

• Rails way
• REST is about resources
 • URL
 • Verbs: GET/PUT/POST/DELETE
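The pairing of URLs and verbs can be written down as a small lookup table. This is an illustrative sketch, not Rails' router: the mailings resource, the ROUTES table, and action_for are hypothetical, and the action names follow the Rails resource conventions.

```ruby
# Hypothetical routing table for a "mailings" resource.
ROUTES = {
  ['GET',    :collection] => :index,
  ['POST',   :collection] => :create,
  ['GET',    :member]     => :show,
  ['PUT',    :member]     => :update,
  ['DELETE', :member]     => :destroy
}

def action_for(verb, path)
  # "/mailings" is the collection, "/mailings/42" is a single member
  kind = path.match?(%r{\A/mailings/\d+\z}) ? :member : :collection
  ROUTES[[verb, kind]]
end

puts action_for('GET', '/mailings')
puts action_for('PUT', '/mailings/42')
puts action_for('DELETE', '/mailings/42')
```

The same URL means different things under different verbs, which is why a service API can stay small: a handful of resources times a few verbs.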
The tool set

• Web framework
• XML Parser
• JSON Parser
• HTTP Client
Web framework

• A service needs little of the controller and view layers
• Rails is more than we need; how about Sinatra?
• Rails Metal
ActiveResource

• Mapping RESTful resources as models in a
  Rails application.
• But not useful in practice, why?
XML parser

• http://nokogiri.org/
• Nokogiri (鋸) is an HTML, XML, SAX, and
  Reader parser. Among Nokogiri’s many
  features is the ability to search documents
  via XPath or CSS3 selectors.
JSON Parser

• http://github.com/brianmario/yajl-ruby/
• An extremely efficient streaming JSON
  parsing and encoding library. Ruby C
  bindings to Yajl
HTTP Client


• http://github.com/pauldix/typhoeus/
• Typhoeus runs HTTP requests in parallel
  while cleanly encapsulating handling logic
Tips

• Define your logical model (i.e. your service
  request result) first.

• model.to_json and model.to_xml are easy to
  use, but not useful in practice.
6.Distributed File System
 •   NFS does not scale
     •   we can use rsync to duplicate files
 •   MogileFS
     •   http://www.danga.com/mogilefs/
     •   http://seattlerb.rubyforge.org/mogilefs-client/
 •   Amazon S3
 •   HDFS (Hadoop Distributed File System)
 •   GlusterFS
7.Distributed Database

• NoSQL
• CAP theorem
 • Eventually consistent
• HBase/Cassandra/Voldemort
The End
References
•   Books&Articles:
    •    Distributed Programming with Ruby, Mark Bates (Addison Wesley)
    •    Enterprise Rails, Dan Chak (O’Reilly)
    •    Service-Oriented Design with Ruby and Rails, Paul Dix (Addison Wesley)
    •    RESTful Web Services, Richardson&Ruby (O’Reilly)
    •    RESTful Web Services Cookbook, Allamaraju&Amundsen (O’Reilly)
    •    Enterprise Recipes with Ruby on Rails, Maik Schmidt (The Pragmatic Programmers)
    •    Ruby in Practice, McAnally&Arkin (Manning)

    •    Building Scalable Web Sites, Cal Henderson (O’Reilly)
    •    Background Processing in Rails, Erik Andrejko (Rails Magazine)
    •    Background Processing with Delayed_Job, James Harrison (Rails Magazine)
•   Slides:
    •    Background Processing (Rob Mack) Austin on Rails - April 2009
    •    The Current State of Asynchronous Processing in Ruby (Mathias Meyer, Peritor GmbH)
    •    Asynchronous Processing (Jonathan Dahl)
    •    Long-Running Tasks In Rails Without Much Effort (Andy Stewart) - April 2008
    •    Starling + Workling: simple distributed background jobs with Twitter’s queuing system, Rany Keddo 2008
    •    Physical Models & Logical Models in Rails, dan chak
References (cont.)
•   Links:
    •   http://segment7.net/projects/ruby/drb/
    •   http://www.slideshare.net/luccastera/concurrent-programming-with-ruby-and-tuple-spaces
    •   http://github.com/blog/542-introducing-resque
    •   http://www.engineyard.com/blog/2009/5-tips-for-deploying-background-jobs/
    •   http://www.opensourcery.co.za/2008/07/07/messaging-and-ruby-part-1-the-big-picture/
    •   http://leemoonsoo.blogspot.com/2009/04/simple-comparison-open-source.html
    •   http://blog.gslin.org/archives/2009/07/25/2065/
    •   http://www.javaeye.com/topic/524977
    •   http://www.allthingsdistributed.com/2008/12/eventually_consistent.html
Todo (maybe next time)
•   AMQP/RabbitMQ example code
    •   How about Nanite?
•   XMPP
•   MagLev VM
•   More MapReduce example code
    •   How about Amazon Elastic MapReduce?
•   Resque example code
•   More SOA example and code
•   MogileFS example code

MySQL Deep dive with FusionIO
 
Spark SQL Adaptive Execution Unleashes The Power of Cluster in Large Scale wi...
Spark SQL Adaptive Execution Unleashes The Power of Cluster in Large Scale wi...Spark SQL Adaptive Execution Unleashes The Power of Cluster in Large Scale wi...
Spark SQL Adaptive Execution Unleashes The Power of Cluster in Large Scale wi...
 
OrientDB for real & Web App development
OrientDB for real & Web App developmentOrientDB for real & Web App development
OrientDB for real & Web App development
 
What is in a Lucene index?
What is in a Lucene index?What is in a Lucene index?
What is in a Lucene index?
 
Let's Talk Locks!
Let's Talk Locks!Let's Talk Locks!
Let's Talk Locks!
 
Oracle Database Performance Tuning Advanced Features and Best Practices for DBAs
Oracle Database Performance Tuning Advanced Features and Best Practices for DBAsOracle Database Performance Tuning Advanced Features and Best Practices for DBAs
Oracle Database Performance Tuning Advanced Features and Best Practices for DBAs
 
RootedCON 2015 - Deep inside the Java framework Apache Struts
RootedCON 2015 - Deep inside the Java framework Apache StrutsRootedCON 2015 - Deep inside the Java framework Apache Struts
RootedCON 2015 - Deep inside the Java framework Apache Struts
 
PostgreSQL_ Up and Running_ A Practical Guide to the Advanced Open Source Dat...
PostgreSQL_ Up and Running_ A Practical Guide to the Advanced Open Source Dat...PostgreSQL_ Up and Running_ A Practical Guide to the Advanced Open Source Dat...
PostgreSQL_ Up and Running_ A Practical Guide to the Advanced Open Source Dat...
 
Understanding Active Directory Enumeration
Understanding Active Directory EnumerationUnderstanding Active Directory Enumeration
Understanding Active Directory Enumeration
 
Hitchhiker's Guide to free Oracle tuning tools
Hitchhiker's Guide to free Oracle tuning toolsHitchhiker's Guide to free Oracle tuning tools
Hitchhiker's Guide to free Oracle tuning tools
 
Indexes in postgres
Indexes in postgresIndexes in postgres
Indexes in postgres
 
A Deep Dive into Spark SQL's Catalyst Optimizer with Yin Huai
A Deep Dive into Spark SQL's Catalyst Optimizer with Yin HuaiA Deep Dive into Spark SQL's Catalyst Optimizer with Yin Huai
A Deep Dive into Spark SQL's Catalyst Optimizer with Yin Huai
 
The MySQL Query Optimizer Explained Through Optimizer Trace
The MySQL Query Optimizer Explained Through Optimizer TraceThe MySQL Query Optimizer Explained Through Optimizer Trace
The MySQL Query Optimizer Explained Through Optimizer Trace
 
Using Delta Lake to Transform a Legacy Apache Spark to Support Complex Update...
Using Delta Lake to Transform a Legacy Apache Spark to Support Complex Update...Using Delta Lake to Transform a Legacy Apache Spark to Support Complex Update...
Using Delta Lake to Transform a Legacy Apache Spark to Support Complex Update...
 
PostgreSQL Table Partitioning / Sharding
PostgreSQL Table Partitioning / ShardingPostgreSQL Table Partitioning / Sharding
PostgreSQL Table Partitioning / Sharding
 
Debugging PySpark - PyCon US 2018
Debugging PySpark -  PyCon US 2018Debugging PySpark -  PyCon US 2018
Debugging PySpark - PyCon US 2018
 
How to Bulletproof Your Scylla Deployment
How to Bulletproof Your Scylla DeploymentHow to Bulletproof Your Scylla Deployment
How to Bulletproof Your Scylla Deployment
 

Andere mochten auch

Getting Distributed (With Ruby On Rails)
Getting Distributed (With Ruby On Rails)Getting Distributed (With Ruby On Rails)
Getting Distributed (With Ruby On Rails)martinbtt
 
IntelliSemantc - Second generation semantic technologies for patents
IntelliSemantc - Second generation semantic technologies for patentsIntelliSemantc - Second generation semantic technologies for patents
IntelliSemantc - Second generation semantic technologies for patentsAlberto Ciaramella
 
Make your app idea a reality with Ruby On Rails
Make your app idea a reality with Ruby On RailsMake your app idea a reality with Ruby On Rails
Make your app idea a reality with Ruby On RailsNataly Tkachuk
 
ActiveRecord Validations, Season 2
ActiveRecord Validations, Season 2ActiveRecord Validations, Season 2
ActiveRecord Validations, Season 2RORLAB
 
ActiveRecord Query Interface (1), Season 1
ActiveRecord Query Interface (1), Season 1ActiveRecord Query Interface (1), Season 1
ActiveRecord Query Interface (1), Season 1RORLAB
 
Service-Oriented Design and Implement with Rails3
Service-Oriented Design and Implement with Rails3Service-Oriented Design and Implement with Rails3
Service-Oriented Design and Implement with Rails3Wen-Tien Chang
 
Rails Engine Patterns
Rails Engine PatternsRails Engine Patterns
Rails Engine PatternsAndy Maleh
 
Proyecto festival
Proyecto festivalProyecto festival
Proyecto festivalfestimango
 
ActiveWarehouse/ETL - BI & DW for Ruby/Rails
ActiveWarehouse/ETL - BI & DW for Ruby/RailsActiveWarehouse/ETL - BI & DW for Ruby/Rails
ActiveWarehouse/ETL - BI & DW for Ruby/RailsPaul Gallagher
 
6 reasons Jubilee could be a Rubyist's new best friend
6 reasons Jubilee could be a Rubyist's new best friend6 reasons Jubilee could be a Rubyist's new best friend
6 reasons Jubilee could be a Rubyist's new best friendForrest Chang
 
Performance Optimization of Rails Applications
Performance Optimization of Rails ApplicationsPerformance Optimization of Rails Applications
Performance Optimization of Rails ApplicationsSerge Smetana
 
Neev Expertise in Ruby on Rails (RoR)
Neev Expertise in Ruby on Rails (RoR)Neev Expertise in Ruby on Rails (RoR)
Neev Expertise in Ruby on Rails (RoR)Neev Technologies
 
Introduction to Ruby on Rails
Introduction to Ruby on RailsIntroduction to Ruby on Rails
Introduction to Ruby on RailsAgnieszka Figiel
 
Ruby on Rails versus Django - A newbie Web Developer's Perspective -Shreyank...
 Ruby on Rails versus Django - A newbie Web Developer's Perspective -Shreyank... Ruby on Rails versus Django - A newbie Web Developer's Perspective -Shreyank...
Ruby on Rails versus Django - A newbie Web Developer's Perspective -Shreyank...ThoughtWorks
 

Andere mochten auch (20)

Getting Distributed (With Ruby On Rails)
Getting Distributed (With Ruby On Rails)Getting Distributed (With Ruby On Rails)
Getting Distributed (With Ruby On Rails)
 
ESPM 2009
ESPM 2009ESPM 2009
ESPM 2009
 
IntelliSemantc - Second generation semantic technologies for patents
IntelliSemantc - Second generation semantic technologies for patentsIntelliSemantc - Second generation semantic technologies for patents
IntelliSemantc - Second generation semantic technologies for patents
 
Make your app idea a reality with Ruby On Rails
Make your app idea a reality with Ruby On RailsMake your app idea a reality with Ruby On Rails
Make your app idea a reality with Ruby On Rails
 
ActiveRecord Validations, Season 2
ActiveRecord Validations, Season 2ActiveRecord Validations, Season 2
ActiveRecord Validations, Season 2
 
ActiveRecord Query Interface (1), Season 1
ActiveRecord Query Interface (1), Season 1ActiveRecord Query Interface (1), Season 1
ActiveRecord Query Interface (1), Season 1
 
Service-Oriented Design and Implement with Rails3
Service-Oriented Design and Implement with Rails3Service-Oriented Design and Implement with Rails3
Service-Oriented Design and Implement with Rails3
 
Rails Engine Patterns
Rails Engine PatternsRails Engine Patterns
Rails Engine Patterns
 
Proyecto festival
Proyecto festivalProyecto festival
Proyecto festival
 
ActiveWarehouse/ETL - BI & DW for Ruby/Rails
ActiveWarehouse/ETL - BI & DW for Ruby/RailsActiveWarehouse/ETL - BI & DW for Ruby/Rails
ActiveWarehouse/ETL - BI & DW for Ruby/Rails
 
6 reasons Jubilee could be a Rubyist's new best friend
6 reasons Jubilee could be a Rubyist's new best friend6 reasons Jubilee could be a Rubyist's new best friend
6 reasons Jubilee could be a Rubyist's new best friend
 
Rails Performance
Rails PerformanceRails Performance
Rails Performance
 
Performance Optimization of Rails Applications
Performance Optimization of Rails ApplicationsPerformance Optimization of Rails Applications
Performance Optimization of Rails Applications
 
Neev Expertise in Ruby on Rails (RoR)
Neev Expertise in Ruby on Rails (RoR)Neev Expertise in Ruby on Rails (RoR)
Neev Expertise in Ruby on Rails (RoR)
 
Introduction to Ruby on Rails
Introduction to Ruby on RailsIntroduction to Ruby on Rails
Introduction to Ruby on Rails
 
Ruby on Rails
Ruby on RailsRuby on Rails
Ruby on Rails
 
IoT×Emotion
IoT×EmotionIoT×Emotion
IoT×Emotion
 
Ruby on Rails Presentation
Ruby on Rails PresentationRuby on Rails Presentation
Ruby on Rails Presentation
 
Ruby Beyond Rails
Ruby Beyond RailsRuby Beyond Rails
Ruby Beyond Rails
 
Ruby on Rails versus Django - A newbie Web Developer's Perspective -Shreyank...
 Ruby on Rails versus Django - A newbie Web Developer's Perspective -Shreyank... Ruby on Rails versus Django - A newbie Web Developer's Perspective -Shreyank...
Ruby on Rails versus Django - A newbie Web Developer's Perspective -Shreyank...
 

Ähnlich wie Distributed Ruby and Rails

DRb at the Ruby Drink-up of Sophia, December 2011
DRb at the Ruby Drink-up of Sophia, December 2011DRb at the Ruby Drink-up of Sophia, December 2011
DRb at the Ruby Drink-up of Sophia, December 2011rivierarb
 
Everything ruby
Everything rubyEverything ruby
Everything rubyajeygore
 
Dragoncraft Architectural Overview
Dragoncraft Architectural OverviewDragoncraft Architectural Overview
Dragoncraft Architectural Overviewjessesanford
 
Apache thrift-RPC service cross languages
Apache thrift-RPC service cross languagesApache thrift-RPC service cross languages
Apache thrift-RPC service cross languagesJimmy Lai
 
Cloud Foundry Open Tour China (english)
Cloud Foundry Open Tour China (english)Cloud Foundry Open Tour China (english)
Cloud Foundry Open Tour China (english)marklucovsky
 
Cloud Foundry Open Tour China
Cloud Foundry Open Tour ChinaCloud Foundry Open Tour China
Cloud Foundry Open Tour Chinamarklucovsky
 
How to create multiprocess server on windows with ruby - rubykaigi2016 Ritta ...
How to create multiprocess server on windows with ruby - rubykaigi2016 Ritta ...How to create multiprocess server on windows with ruby - rubykaigi2016 Ritta ...
How to create multiprocess server on windows with ruby - rubykaigi2016 Ritta ...Ritta Narita
 
Middleware as Code with mruby
Middleware as Code with mrubyMiddleware as Code with mruby
Middleware as Code with mrubyHiroshi SHIBATA
 
FreeSWITCH as a Microservice
FreeSWITCH as a MicroserviceFreeSWITCH as a Microservice
FreeSWITCH as a MicroserviceEvan McGee
 
DRb and Rinda
DRb and RindaDRb and Rinda
DRb and RindaMark
 
D Rb Silicon Valley Ruby Conference
D Rb   Silicon Valley Ruby ConferenceD Rb   Silicon Valley Ruby Conference
D Rb Silicon Valley Ruby Conferencenextlib
 
Rails web api 开发
Rails web api 开发Rails web api 开发
Rails web api 开发shaokun
 
TorqueBox - Ruby Hoedown 2011
TorqueBox - Ruby Hoedown 2011TorqueBox - Ruby Hoedown 2011
TorqueBox - Ruby Hoedown 2011Lance Ball
 
A language for the Internet: Why JavaScript and Node.js is right for Internet...
A language for the Internet: Why JavaScript and Node.js is right for Internet...A language for the Internet: Why JavaScript and Node.js is right for Internet...
A language for the Internet: Why JavaScript and Node.js is right for Internet...Tom Croucher
 
Ruby on Rails All Hands Meeting
Ruby on Rails All Hands MeetingRuby on Rails All Hands Meeting
Ruby on Rails All Hands MeetingDan Davis
 
Aws Lambda in Swift - NSLondon - 3rd December 2020
Aws Lambda in Swift - NSLondon - 3rd December 2020Aws Lambda in Swift - NSLondon - 3rd December 2020
Aws Lambda in Swift - NSLondon - 3rd December 2020Andrea Scuderi
 
Writing robust Node.js applications
Writing robust Node.js applicationsWriting robust Node.js applications
Writing robust Node.js applicationsTom Croucher
 

Ähnlich wie Distributed Ruby and Rails (20)

DRb at the Ruby Drink-up of Sophia, December 2011
DRb at the Ruby Drink-up of Sophia, December 2011DRb at the Ruby Drink-up of Sophia, December 2011
DRb at the Ruby Drink-up of Sophia, December 2011
 
Everything ruby
Everything rubyEverything ruby
Everything ruby
 
Dragoncraft Architectural Overview
Dragoncraft Architectural OverviewDragoncraft Architectural Overview
Dragoncraft Architectural Overview
 
Apache thrift-RPC service cross languages
Apache thrift-RPC service cross languagesApache thrift-RPC service cross languages
Apache thrift-RPC service cross languages
 
Cloud Foundry Open Tour China (english)
Cloud Foundry Open Tour China (english)Cloud Foundry Open Tour China (english)
Cloud Foundry Open Tour China (english)
 
Cloud Foundry Open Tour China
Cloud Foundry Open Tour ChinaCloud Foundry Open Tour China
Cloud Foundry Open Tour China
 
How to create multiprocess server on windows with ruby - rubykaigi2016 Ritta ...
How to create multiprocess server on windows with ruby - rubykaigi2016 Ritta ...How to create multiprocess server on windows with ruby - rubykaigi2016 Ritta ...
How to create multiprocess server on windows with ruby - rubykaigi2016 Ritta ...
 
Middleware as Code with mruby
Middleware as Code with mrubyMiddleware as Code with mruby
Middleware as Code with mruby
 
FreeSWITCH as a Microservice
FreeSWITCH as a MicroserviceFreeSWITCH as a Microservice
FreeSWITCH as a Microservice
 
DRb and Rinda
DRb and RindaDRb and Rinda
DRb and Rinda
 
D Rb Silicon Valley Ruby Conference
D Rb   Silicon Valley Ruby ConferenceD Rb   Silicon Valley Ruby Conference
D Rb Silicon Valley Ruby Conference
 
Get your teeth into Plack
Get your teeth into PlackGet your teeth into Plack
Get your teeth into Plack
 
Rails web api 开发
Rails web api 开发Rails web api 开发
Rails web api 开发
 
TorqueBox - Ruby Hoedown 2011
TorqueBox - Ruby Hoedown 2011TorqueBox - Ruby Hoedown 2011
TorqueBox - Ruby Hoedown 2011
 
A language for the Internet: Why JavaScript and Node.js is right for Internet...
A language for the Internet: Why JavaScript and Node.js is right for Internet...A language for the Internet: Why JavaScript and Node.js is right for Internet...
A language for the Internet: Why JavaScript and Node.js is right for Internet...
 
Ruby on Rails All Hands Meeting
Ruby on Rails All Hands MeetingRuby on Rails All Hands Meeting
Ruby on Rails All Hands Meeting
 
Aws Lambda in Swift - NSLondon - 3rd December 2020
Aws Lambda in Swift - NSLondon - 3rd December 2020Aws Lambda in Swift - NSLondon - 3rd December 2020
Aws Lambda in Swift - NSLondon - 3rd December 2020
 
Writing robust Node.js applications
Writing robust Node.js applicationsWriting robust Node.js applications
Writing robust Node.js applications
 
locize tech talk
locize tech talklocize tech talk
locize tech talk
 
Docker tlv
Docker tlvDocker tlv
Docker tlv
 

Mehr von Wen-Tien Chang

⼤語⾔模型 LLM 應⽤開發入⾨
⼤語⾔模型 LLM 應⽤開發入⾨⼤語⾔模型 LLM 應⽤開發入⾨
⼤語⾔模型 LLM 應⽤開發入⾨Wen-Tien Chang
 
Ruby Rails 老司機帶飛
Ruby Rails 老司機帶飛Ruby Rails 老司機帶飛
Ruby Rails 老司機帶飛Wen-Tien Chang
 
A brief introduction to Machine Learning
A brief introduction to Machine LearningA brief introduction to Machine Learning
A brief introduction to Machine LearningWen-Tien Chang
 
淺談 Startup 公司的軟體開發流程 v2
淺談 Startup 公司的軟體開發流程 v2淺談 Startup 公司的軟體開發流程 v2
淺談 Startup 公司的軟體開發流程 v2Wen-Tien Chang
 
RSpec on Rails Tutorial
RSpec on Rails TutorialRSpec on Rails Tutorial
RSpec on Rails TutorialWen-Tien Chang
 
ALPHAhackathon: How to collaborate
ALPHAhackathon: How to collaborateALPHAhackathon: How to collaborate
ALPHAhackathon: How to collaborateWen-Tien Chang
 
Git 版本控制系統 -- 從微觀到宏觀
Git 版本控制系統 -- 從微觀到宏觀Git 版本控制系統 -- 從微觀到宏觀
Git 版本控制系統 -- 從微觀到宏觀Wen-Tien Chang
 
Exception Handling: Designing Robust Software in Ruby (with presentation note)
Exception Handling: Designing Robust Software in Ruby (with presentation note)Exception Handling: Designing Robust Software in Ruby (with presentation note)
Exception Handling: Designing Robust Software in Ruby (with presentation note)Wen-Tien Chang
 
Exception Handling: Designing Robust Software in Ruby
Exception Handling: Designing Robust Software in RubyException Handling: Designing Robust Software in Ruby
Exception Handling: Designing Robust Software in RubyWen-Tien Chang
 
從 Classes 到 Objects: 那些 OOP 教我的事
從 Classes 到 Objects: 那些 OOP 教我的事從 Classes 到 Objects: 那些 OOP 教我的事
從 Classes 到 Objects: 那些 OOP 教我的事Wen-Tien Chang
 
Yet another introduction to Git - from the bottom up
Yet another introduction to Git - from the bottom upYet another introduction to Git - from the bottom up
Yet another introduction to Git - from the bottom upWen-Tien Chang
 
A brief introduction to Vagrant – 原來 VirtualBox 可以這樣玩
A brief introduction to Vagrant – 原來 VirtualBox 可以這樣玩A brief introduction to Vagrant – 原來 VirtualBox 可以這樣玩
A brief introduction to Vagrant – 原來 VirtualBox 可以這樣玩Wen-Tien Chang
 
Ruby 程式語言綜覽簡介
Ruby 程式語言綜覽簡介Ruby 程式語言綜覽簡介
Ruby 程式語言綜覽簡介Wen-Tien Chang
 
A brief introduction to SPDY - 邁向 HTTP/2.0
A brief introduction to SPDY - 邁向 HTTP/2.0A brief introduction to SPDY - 邁向 HTTP/2.0
A brief introduction to SPDY - 邁向 HTTP/2.0Wen-Tien Chang
 
RubyConf Taiwan 2012 Opening & Closing
RubyConf Taiwan 2012 Opening & ClosingRubyConf Taiwan 2012 Opening & Closing
RubyConf Taiwan 2012 Opening & ClosingWen-Tien Chang
 
從 Scrum 到 Kanban: 為什麼 Scrum 不適合 Lean Startup
從 Scrum 到 Kanban: 為什麼 Scrum 不適合 Lean Startup從 Scrum 到 Kanban: 為什麼 Scrum 不適合 Lean Startup
從 Scrum 到 Kanban: 為什麼 Scrum 不適合 Lean StartupWen-Tien Chang
 
那些 Functional Programming 教我的事
那些 Functional Programming 教我的事那些 Functional Programming 教我的事
那些 Functional Programming 教我的事Wen-Tien Chang
 
RubyConf Taiwan 2011 Opening & Closing
RubyConf Taiwan 2011 Opening & ClosingRubyConf Taiwan 2011 Opening & Closing
RubyConf Taiwan 2011 Opening & ClosingWen-Tien Chang
 

Mehr von Wen-Tien Chang (20)

⼤語⾔模型 LLM 應⽤開發入⾨
⼤語⾔模型 LLM 應⽤開發入⾨⼤語⾔模型 LLM 應⽤開發入⾨
⼤語⾔模型 LLM 應⽤開發入⾨
 
Ruby Rails 老司機帶飛
Ruby Rails 老司機帶飛Ruby Rails 老司機帶飛
Ruby Rails 老司機帶飛
 
A brief introduction to Machine Learning
A brief introduction to Machine LearningA brief introduction to Machine Learning
A brief introduction to Machine Learning
 
淺談 Startup 公司的軟體開發流程 v2
淺談 Startup 公司的軟體開發流程 v2淺談 Startup 公司的軟體開發流程 v2
淺談 Startup 公司的軟體開發流程 v2
 
RSpec on Rails Tutorial
RSpec on Rails TutorialRSpec on Rails Tutorial
RSpec on Rails Tutorial
 
RSpec & TDD Tutorial
RSpec & TDD TutorialRSpec & TDD Tutorial
RSpec & TDD Tutorial
 
ALPHAhackathon: How to collaborate
ALPHAhackathon: How to collaborateALPHAhackathon: How to collaborate
ALPHAhackathon: How to collaborate
 
Git 版本控制系統 -- 從微觀到宏觀
Git 版本控制系統 -- 從微觀到宏觀Git 版本控制系統 -- 從微觀到宏觀
Git 版本控制系統 -- 從微觀到宏觀
 
Exception Handling: Designing Robust Software in Ruby (with presentation note)
Exception Handling: Designing Robust Software in Ruby (with presentation note)Exception Handling: Designing Robust Software in Ruby (with presentation note)
Exception Handling: Designing Robust Software in Ruby (with presentation note)
 
Exception Handling: Designing Robust Software in Ruby
Exception Handling: Designing Robust Software in RubyException Handling: Designing Robust Software in Ruby
Exception Handling: Designing Robust Software in Ruby
 
從 Classes 到 Objects: 那些 OOP 教我的事
從 Classes 到 Objects: 那些 OOP 教我的事從 Classes 到 Objects: 那些 OOP 教我的事
從 Classes 到 Objects: 那些 OOP 教我的事
 
Yet another introduction to Git - from the bottom up
Yet another introduction to Git - from the bottom upYet another introduction to Git - from the bottom up
Yet another introduction to Git - from the bottom up
 
A brief introduction to Vagrant – 原來 VirtualBox 可以這樣玩
A brief introduction to Vagrant – 原來 VirtualBox 可以這樣玩A brief introduction to Vagrant – 原來 VirtualBox 可以這樣玩
A brief introduction to Vagrant – 原來 VirtualBox 可以這樣玩
 
Ruby 程式語言綜覽簡介
Ruby 程式語言綜覽簡介Ruby 程式語言綜覽簡介
Ruby 程式語言綜覽簡介
 
A brief introduction to SPDY - 邁向 HTTP/2.0
A brief introduction to SPDY - 邁向 HTTP/2.0A brief introduction to SPDY - 邁向 HTTP/2.0
A brief introduction to SPDY - 邁向 HTTP/2.0
 
RubyConf Taiwan 2012 Opening & Closing
RubyConf Taiwan 2012 Opening & ClosingRubyConf Taiwan 2012 Opening & Closing
RubyConf Taiwan 2012 Opening & Closing
 
從 Scrum 到 Kanban: 為什麼 Scrum 不適合 Lean Startup
從 Scrum 到 Kanban: 為什麼 Scrum 不適合 Lean Startup從 Scrum 到 Kanban: 為什麼 Scrum 不適合 Lean Startup
從 Scrum 到 Kanban: 為什麼 Scrum 不適合 Lean Startup
 
Git Tutorial 教學
Git Tutorial 教學Git Tutorial 教學
Git Tutorial 教學
 
那些 Functional Programming 教我的事
那些 Functional Programming 教我的事那些 Functional Programming 教我的事
那些 Functional Programming 教我的事
 
RubyConf Taiwan 2011 Opening & Closing
RubyConf Taiwan 2011 Opening & ClosingRubyConf Taiwan 2011 Opening & Closing
RubyConf Taiwan 2011 Opening & Closing
 

Kürzlich hochgeladen

UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfDianaGray10
 
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1DianaGray10
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesDavid Newbury
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaborationbruanjhuli
 
Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.YounusS2
 
Building AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxBuilding AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxUdaiappa Ramachandran
 
How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?IES VE
 
VoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBXVoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBXTarek Kalaji
 
Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioChristian Posta
 
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAAnypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAshyamraj55
 
Cybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptxCybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptxGDSC PJATK
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024D Cloud Solutions
 
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve DecarbonizationUsing IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve DecarbonizationIES VE
 
Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Brian Pichman
 
9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding TeamAdam Moalla
 
OpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability AdventureOpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability AdventureEric D. Schabell
 
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...DianaGray10
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Will Schroeder
 
Nanopower In Semiconductor Industry.pdf
Nanopower  In Semiconductor Industry.pdfNanopower  In Semiconductor Industry.pdf
Nanopower In Semiconductor Industry.pdfPedro Manuel
 

Kürzlich hochgeladen (20)

201610817 - edge part1
201610817 - edge part1201610817 - edge part1
201610817 - edge part1
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
 
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond Ontologies
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
 
Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.
 
Building AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxBuilding AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptx
 
How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?
 
VoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBXVoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBX
 
Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and Istio
 
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAAnypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
 
Cybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptxCybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptx
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024
 
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve DecarbonizationUsing IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
 
Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )
 
9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team
 
OpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability AdventureOpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability Adventure
 
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
 
Nanopower In Semiconductor Industry.pdf
Nanopower  In Semiconductor Industry.pdfNanopower  In Semiconductor Industry.pdf
Nanopower In Semiconductor Industry.pdf
 

Distributed Ruby and Rails

  • 1. Distributed Ruby and Rails @ihower http://ihower.tw 2010/1
  • 2. About Me • a.k.a. ihower • http://ihower.tw • http://twitter.com/ihower • http://github.com/ihower • Ruby on Rails Developer since 2006 • Ruby Taiwan Community • http://ruby.tw
  • 3. Agenda • Distributed Ruby • Distributed Message Queues • Background-processing in Rails • Message Queues for Rails • SOA for Rails • Distributed Filesystem • Distributed database
  • 4. 1.Distributed Ruby • DRb • Rinda • Starfish • MapReduce • MagLev VM
  • 5. DRb • Ruby's RMI system (remote method invocation) • an object in one Ruby process can invoke methods on an object in another Ruby process on the same or a different machine
  • 6. DRb (cont.) • no defined interface, so development is faster • tightly couples applications, because there is no defined API, only methods on objects • unreliable in large-scale, heavy-load production environments
  • 7. server example 1 require 'drb' class HelloWorldServer def say_hello 'Hello, world!' end end DRb.start_service("druby://127.0.0.1:61676", HelloWorldServer.new) DRb.thread.join
  • 8. client example 1 require 'drb' server = DRbObject.new_with_uri("druby://127.0.0.1:61676") puts server.say_hello puts server.inspect # Hello, world! # <DRb::DRbObject:0x1003c04c8 @ref=nil, @uri="druby://127.0.0.1:61676">
  • 9. example 2 # user.rb class User attr_accessor :username end
  • 10. server example 2 require 'drb' require 'user' class UserServer attr_accessor :users def find(id) self.users[id-1] end end user_server = UserServer.new user_server.users = [] 5.times do |i| user = User.new user.username = i + 1 user_server.users << user end DRb.start_service("druby://127.0.0.1:61676", user_server) DRb.thread.join
  • 11. client example 2 require 'drb' user_server = DRbObject.new_with_uri("druby://127.0.0.1:61676") user = user_server.find(2) puts user.inspect puts "Username: #{user.username}" user.username = "ihower" puts "Username: #{user.username}"
  • 12. Err... # <DRb::DRbUnknown:0x1003b8318 @name="User", @buf="004bo: tUser006:016@usernameia"> # client2.rb:8: undefined method `username' for #<DRb::DRbUnknown:0x1003b8318> (NoMethodError)
  • 13. Why? DRbUndumped • Default DRb operation • Pass by value • Must share code • With DRbUndumped • Pass by reference • No need to share code
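The difference can be seen without any network connection, because DRb copies arguments and return values with Ruby's Marshal. The following is a minimal sketch under that assumption; the class names ValueUser and ReferenceUser are hypothetical and do not come from the slides.

```ruby
require 'drb'

# Plain object: DRb can marshal it, so the other side receives a copy.
class ValueUser
  attr_accessor :username
end

# DRbUndumped refuses to be marshalled, so DRb sends a reference instead.
class ReferenceUser
  include DRb::DRbUndumped
  attr_accessor :username
end

original = ValueUser.new
original.username = 'ihower'

# Simulate what crossing the wire does to a pass-by-value object.
copy = Marshal.load(Marshal.dump(original))
copy.username = 'changed'
puts original.username # the original is not affected by changes to the copy

# An undumped object cannot be copied at all.
reference_dumpable =
  begin
    Marshal.dump(ReferenceUser.new)
    true
  rescue TypeError
    false
  end
puts reference_dumpable
```

Because the copy must be rebuilt on the receiving side, pass by value only works when both processes have loaded the class definition, which is the "must share code" point above.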
  • 14. Example 2 Fixed # user.rb class User include DRbUndumped attr_accessor :username end # <DRb::DRbObject:0x1003b84f8 @ref=2149433940, @uri="druby://127.0.0.1:61676"> # Username: 2 # Username: ihower
  • 15. Why use DRbUndumped? • Big objects • Singleton objects • Lightweight clients • Rapidly changing software
  • 16. ID conversion • Converts reference into DRb object on server • DRbIdConv (Default) • TimerIdConv • NamedIdConv • GWIdConv
  • 17. Beware of garbage collection • referenced objects may be collected on the server (usually doesn't matter) • Build your own ID converter if you want to control persistent state.
  • 18. DRb security require 'drb' ro = DRbObject.new_with_uri("druby://127.0.0.1:61676") class << ro undef :instance_eval end # !!!!!!!! WARNING !!!!!!!!! DO NOT RUN ro.instance_eval("`rm -rf *`")
  • 19. $SAFE=1 instance_eval': Insecure operation - instance_eval (SecurityError)
  • 20. DRb security (cont.) • Access Control Lists (ACLs) • via IP address array • still can run denial-of-service attack • DRb over SSL
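An IP-based ACL can be sketched with the `drb/acl` library that ships with DRb. The addresses below are made-up examples, and the check is run by hand only to show which peers would be accepted.

```ruby
require 'drb'
require 'drb/acl'

# Deny everyone, then allow only the local machine.
acl = ACL.new(%w[deny all allow 127.0.0.1], ACL::DENY_ALLOW)

# Applies to DRb services started after this call.
DRb.install_acl(acl)

# Address arrays use the Socket#peeraddr layout: [family, port, host, ip].
local  = ['AF_INET', 0, 'localhost', '127.0.0.1']
remote = ['AF_INET', 0, 'example', '192.168.1.50']

puts acl.allow_addr?(local)
puts acl.allow_addr?(remote)
```

An ACL only filters by address: a permitted or spoofed peer can still flood the service, which is why the slide notes that denial-of-service attacks remain possible.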
  • 21. Rinda • Rinda is a Ruby port of the Linda distributed computing paradigm. • Linda is a model of coordination and communication among several parallel processes operating upon objects stored in and retrieved from shared, virtual, associative memory. This model is implemented as a "coordination language" in which several primitives operating on ordered sequence of typed data objects, "tuples," are added to a sequential language, such as C, and a logically global associative memory, called a tuplespace, in which processes store and retrieve tuples. (Wikipedia)
  • 22. Rinda (cont.) • Rinda consists of: • a TupleSpace implementation • a RingServer that allows DRb services to automatically discover each other.
  • 23. RingServer • Hardcoding IP addresses in a DRb program couples applications tightly and makes fault tolerance difficult. • RingServer lets a program detect and interact with other services on the network without knowing their IP addresses.
  • 24. RingServer discovery flow: 1. Client @ 192.168.1.100 asks the RingServer via the UDP broadcast address: Where is Service X? 2. RingServer answers: Service X is at 192.168.1.12 3. Client to Service X @ 192.168.1.12: Hi, Service X 4. Service X to Client @ 192.168.1.100: Hi there
  • 25. ring server example require 'rinda/ring' require 'rinda/tuplespace' DRb.start_service Rinda::RingServer.new(Rinda::TupleSpace.new) DRb.thread.join
  • 26. service example require 'rinda/ring' class HelloWorldServer include DRbUndumped # Need for RingServer def say_hello 'Hello, world!' end end DRb.start_service ring_server = Rinda::RingFinger.primary ring_server.write([:hello_world_service, :HelloWorldServer, HelloWorldServer.new, 'I like to say hi!'], Rinda::SimpleRenewer.new) DRb.thread.join
  • 27. client example require 'rinda/ring' DRb.start_service ring_server = Rinda::RingFinger.primary service = ring_server.read([:hello_world_service, nil,nil,nil]) server = service[2] puts server.say_hello puts service.inspect # Hello, world! # [:hello_world_service, :HelloWorldServer, #<DRb::DRbObject:0x10039b650 @uri="druby://fe80::21b:63ff:fec9:335f%en1:57416", @ref=2149388540>, "I like to say hi!"]
  • 28. TupleSpaces • Shared object space • Atomic access • Works like a bulletin board • Tuple template is [:name, :Class, object, 'description']
  • 29. 5 Basic Operations • write • read • take (Atomic Read+Delete) • read_all • notify (Callback for write/take/delete)
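Four of these operations can be tried in a single process, because `Rinda::TupleSpace` also works without a RingServer. This is a small sketch with made-up `:job` tuples; `notify` is left out to keep it short.

```ruby
require 'rinda/tuplespace'

space = Rinda::TupleSpace.new

# write: post tuples on the "bulletin board".
space.write([:job, 1, 'resize photo'])
space.write([:job, 2, 'send mail'])

# read: match a template (nil is a wildcard) without removing the tuple.
first = space.read([:job, nil, nil])

# read_all: every tuple matching the template, still without removing any.
all_before = space.read_all([:job, nil, nil])

# take: atomically read and delete one matching tuple.
taken = space.take([:job, 1, nil])
all_after = space.read_all([:job, nil, nil])

puts first.inspect
puts all_before.size
puts taken.inspect
puts all_after.inspect
```

Since `take` removes the tuple in one atomic step, two workers taking from the same space will never receive the same job.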
  • 30. Starfish • Starfish is a utility to make distributed programming ridiculously easy • It runs both the server and the client in infinite loops • MapReduce with ActiveRecord or Files
  • 31. starfish foo.rb # foo.rb class Foo attr_reader :i def initialize @i = 0 end def inc logger.info "YAY it incremented by 1 up to #{@i}" @i += 1 end end server :log => "foo.log" do |object| object = Foo.new end client do |object| object.inc end
  • 32. starfish server example ARGV.unshift('server.rb') require 'rubygems' require 'starfish' class HelloWorld def say_hi 'Hi There' end end Starfish.server = lambda do |object| object = HelloWorld.new end Starfish.new('hello_world').server
  • 33. starfish client example ARGV.unshift('client.rb') require 'rubygems' require 'starfish' Starfish.client = lambda do |object| puts object.say_hi exit(0) # exit program immediately end Starfish.new('hello_world').client
  • 34. starfish client example (another way) ARGV.unshift('client.rb') require 'rubygems' require 'starfish' catch(:halt) do Starfish.client = lambda do |object| puts object.say_hi throw :halt end Starfish.new('hello_world').client end puts "bye bye"
  • 35. MapReduce • introduced by Google to support distributed computing on large data sets on clusters of computers. • inspired by map and reduce functions commonly used in functional programming.
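The idea can be illustrated in plain Ruby before bringing in any library. The sketch below is not Starfish code: the log lines are invented sample data, and the chunks are processed one after another where a real cluster would hand them to separate workers.

```ruby
# Hypothetical sample data standing in for a web server access log.
LOG_LINES = [
  'GET /index.html 200',
  'GET /missing 404',
  'POST /login 200',
  'GET /about 200',
  'GET /oops 500'
].freeze

# Map: turn one chunk of lines into partial counts per status code.
def map_chunk(lines)
  lines.each_with_object(Hash.new(0)) do |line, counts|
    status = line.split.last
    counts[status] += 1
  end
end

# Reduce: merge the partial counts into one result.
def reduce_counts(partials)
  partials.reduce({}) do |total, partial|
    total.merge(partial) { |_status, a, b| a + b }
  end
end

# Each chunk could go to a different worker; here they run in sequence.
partials = LOG_LINES.each_slice(2).map { |chunk| map_chunk(chunk) }
totals = reduce_counts(partials)
puts totals.inspect
```

The map step needs no shared state, which is what makes it easy to spread across machines; only the reduce step has to see all partial results.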
  • 36. starfish server example ARGV.unshift('server.rb') require 'rubygems' require 'starfish' Starfish.server = lambda{ |map_reduce| map_reduce.type = File map_reduce.input = "/var/log/apache2/access.log" map_reduce.queue_size = 10 map_reduce.lines_per_client = 5 map_reduce.rescan_when_complete = false } Starfish.new('log_server').server
  • 37. starfish client example ARGV.unshift('client.rb') require 'rubygems' require 'starfish' Starfish.client = lambda { |logs| logs.each do |log| puts "Processing #{log}" sleep(1) end } Starfish.new("log_server").client
  • 38. Other implementations • Skynet • Use TupleSpace or MySQL as message queue • Include an extension for ActiveRecord • http://skynet.rubyforge.org/ • MRToolkit based on Hadoop • http://code.google.com/p/mrtoolkit/
  • 39. MagLev VM • a fast, stable Ruby implementation with integrated object persistence and a distributed shared cache. • http://maglev.gemstone.com/ • currently in public alpha
  • 40. 2.Distributed Message Queues • Starling • AMQP/RabbitMQ • Stomp/ActiveMQ • beanstalkd
  • 41. what’s message queue? Message X Client Queue Check and processing Processor
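The client, queue and processor roles can be sketched inside one process with Ruby's built-in `Queue`. This only illustrates the pattern: a real message queue is a separate server, and the `:done` marker is an assumption made here so that the example terminates.

```ruby
# Thread::Queue from Ruby's core stands in for a queue server here.
queue = Queue.new
processed = []

# Processor: blocks on pop until a message arrives, stops on the :done marker.
processor = Thread.new do
  while (message = queue.pop) != :done
    processed << "processed #{message}"
  end
end

# Client: enqueues messages and carries on without waiting for the work.
3.times { |i| queue << "message #{i}" }
queue << :done

processor.join
puts processed
```

The client never calls the processor directly; both sides only need to agree on the queue and the message format.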
  • 42. Why not DRb? • DRb has security risks and poorly designed APIs • a distributed message queue is a great way to do distributed programming: reliable and scalable.
  • 43. Starling • a light-weight persistent queue server that speaks the Memcache protocol (mimics its API) • Fast, effective, quick to set up and easy to use • Powered by EventMachine http://eventmachine.rubyforge.org/EventMachine.html • Twitter’s open source project; they used it before 2009 (they have since switched to Kestrel, a port of Starling from Ruby to Scala)
  • 44. Starling command • sudo gem install starling-starling • http://github.com/starling/starling • sudo starling -h 192.168.1.100 • sudo starling_top -h 192.168.1.100
  • 45. Starling set example require 'rubygems' require 'starling' starling = Starling.new('192.168.1.4:22122') 100.times do |i| starling.set('my_queue', i) end # set appends to the queue; it does not overwrite the value as it would in Memcached
  • 46. Starling get example require 'rubygems' require 'starling' starling = Starling.new('192.168.2.4:22122') loop do puts starling.get("my_queue") end
  • 47. get method • FIFO • After get, the object is no longer in the queue. You will lose the message if a processing error happens. • The get method blocks until something is returned. It is an infinite loop.
  • 48. Handling processing errors require 'rubygems' require 'starling' starling = Starling.new('192.168.2.4:22122') results = starling.get("my_queue") begin puts results.flatten rescue NoMethodError => e puts e.message starling.set("my_queue", [results]) rescue Exception => e starling.set("my_queue", results) raise e end
  • 49. Starling cons • Clients must poll the queue constantly • With RabbitMQ you can subscribe to a queue and be notified when a message is available for processing.
  • 50. AMQP/RabbitMQ • a complete and highly reliable enterprise messaging system based on the emerging AMQP standard. • Erlang • http://github.com/tmm1/amqp • Powered by EventMachine
  • 51. Stomp/ActiveMQ • Apache ActiveMQ is the most popular and powerful open source messaging and Integration Patterns provider. • sudo gem install stomp • ActiveMessaging plugin for Rails
  • 52. beanstalkd • Beanstalk is a simple, fast workqueue service. Its interface is generic, but was originally designed for reducing the latency of page views in high-volume web applications by running time-consuming tasks asynchronously. • http://kr.github.com/beanstalkd/ • http://beanstalk.rubyforge.org/ • Originally built for the Causes application on Facebook; open source
  • 53. Why do we need asynchronous/background processing in Rails? • cron-like processing (compute daily statistics data, create reports, full-text search index update, etc.) • long-running tasks (sending mail, resizing photos, encoding videos, generating PDFs, uploading images to S3, posting something to Twitter, etc.) • Server traffic jam: expensive requests block server resources (i.e. your Rails app) • Bad user experience: users may try to reload again and again! (responsiveness matters)
  • 54. 3. Background-processing for Rails • script/runner • rake • cron • daemon • run_later plugin • spawn plugin
  • 55. script/runner • In your Rails app root: • script/runner "Worker.process"
  • 56. rake • In RAILS_ROOT/lib/tasks/dev.rake • rake dev:process namespace :dev do task :process do #... end end
  • 57. cron • Cron is a time-based job scheduler in Unix-like computer operating systems. • crontab -e
  • 58. Whenever http://github.com/javan/whenever • A Ruby DSL for Defining Cron Jobs • http://asciicasts.com/episodes/164-cron-in-ruby • or http://cronedit.rubyforge.org/ every 3.hours do runner "MyModel.some_process" rake "my:rake:task" command "/usr/bin/my_great_command" end
  • 60. rufus-scheduler http://github.com/jmettraux/rufus-scheduler • schedules pieces of code (jobs) • Not a replacement for cron/at, since it runs inside Ruby. require 'rubygems' require 'rufus/scheduler' scheduler = Rufus::Scheduler.start_new scheduler.every '5s' do puts 'check blood pressure' end scheduler.join
  • 61. Daemon Kit http://github.com/kennethkalmer/daemon-kit • Creating Ruby daemons by providing a sound application skeleton (through a generator), task specific generators (jabber bot, etc) and robust environment management code.
  • 62. Monitor your daemon • http://mmonit.com/monit/ • http://github.com/arya/bluepill • http://god.rubyforge.org/
  • 63. daemon_controller http://github.com/FooBarWidget/daemon_controller • A library for robust daemon management • Make daemon-dependent applications Just Work without having to start the daemons manually.
  • 64. off-load task via system command # mailings_controller.rb def deliver call_rake :send_mailing, :mailing_id => params[:id].to_i flash[:notice] = "Delivering mailing" redirect_to mailings_url end # controllers/application.rb def call_rake(task, options = {}) options[:rails_env] ||= Rails.env args = options.map { |n, v| "#{n.to_s.upcase}='#{v}'" } # redirect stdout to the log first, then stderr to the same place system "/usr/bin/rake #{task} #{args.join(' ')} --trace >> #{Rails.root}/log/rake.log 2>&1 &" end # lib/tasks/mailer.rake desc "Send mailing" task :send_mailing => :environment do mailing = Mailing.find(ENV["MAILING_ID"]) mailing.deliver end # models/mailing.rb def deliver sleep 10 # placeholder for sending email update_attribute(:delivered_at, Time.now) end
  • 65. Simple Thread after_filter do Thread.new do AccountMailer.deliver_signup(@user) end end
  • 66. run_later plugin http://github.com/mattmatt/run_later • Borrowed from Merb • Uses worker thread and a queue • Simple solution for simple tasks run_later do AccountMailer.deliver_signup(@user) end
  • 67. spawn plugin http://github.com/tra/spawn spawn do logger.info("I feel sleepy...") sleep 11 logger.info("Time to wake up!") end
  • 68. spawn (cont.) • By default, spawn uses fork to spawn child processes. You can configure it to use threads instead. • Works by creating new database connections in ActiveRecord::Base for the spawned block. • Fork needs to copy the Rails process every time
  • 69. threading vs. forking • Forking advantages: • more reliable? - the ActiveRecord code is not thread-safe. • keep running - subprocess can live longer than its parent. • easier - just works with Rails default settings. Threading requires you to set allow_concurrency=true. Also, beware of automatic reloading of classes in development mode (config.cache_classes = false). • Threading advantages: • less filling - threads take fewer resources... how much less? it depends. • debugging - you can set breakpoints in your threads
  • 70. Okay, we need a reliable messaging system: • Persistent • Scheduling: not necessarily all at the same time • Scalability: just throw in more instances of your program to speed up processing • Loosely coupled components that merely ‘talk’ to each other • Ability to easily replace Ruby with something else for specific tasks • Easy to debug and monitor
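The core difference between the two approaches is memory sharing, which plain Ruby can show without Rails or the spawn plugin. The sketch assumes a Unix-like system, since `fork` is not available on Windows.

```ruby
counter = 0

# Thread: shares memory with the parent, so the change is visible afterwards.
Thread.new { counter += 1 }.join
after_thread = counter

# Fork: the child works on its own copy of the process memory.
reader, writer = IO.pipe
pid = fork do
  reader.close
  counter += 100
  writer.puts counter
  writer.close
end
writer.close
child_value = reader.read.to_i
reader.close
Process.wait(pid)

puts after_thread # counter as seen after the thread ran
puts child_value  # counter as seen inside the child process
puts counter      # parent's counter after the child exited
```

The child's change never reaches the parent, so a forked job has to report back through a pipe, the database or a queue. That isolation is also why a fork cannot corrupt state held by the parent process.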
  • 71. 4. Message Queues (for Rails only) • ar_mailer • BackgrounDRb • workling • delayed_job • resque
  • 72. Rails only? • Easy to use/write code • Jobs are Ruby classes or objects • But need to load Rails environment
  • 73. ar_mailer http://seattlerb.rubyforge.org/ar_mailer/ • a two-phase delivery agent for ActionMailer. • Stores messages in the database • Messages are delivered later by a separate process, ar_sendmail.
  • 74. BackgrounDRb http://backgroundrb.rubyforge.org/ • BackgrounDRb is a Ruby job server and scheduler. • Has scalability problems (at about 20 servers, according to Mark Bates) • Hard to know whether a processing error occurred • Uses the database to persist tasks • Uses memcached to report processing results
  • 75. workling http://github.com/purzelrakete/workling • Gives your Rails app a simple API that you can use to make code run in the background, outside of your request. • Supports Starling (default), BackgroundJob, Spawn and AMQP/RabbitMQ runners.
  • 76. Workling/Starling setup • script/plugin install git://github.com/purzelrakete/workling.git • sudo starling -p 15151 • RAILS_ENV=production script/workling_client start
  • 77. Workling example class EmailWorker < Workling::Base def deliver(options) user = User.find(options[:id]) user.deliver_activation_email end end # in your controller def create EmailWorker.asynch_deliver( :id => 1) end
  • 78. delayed_job • Database-backed asynchronous priority queue • Extracted from Shopify • you can place any Ruby object on its queue as arguments • Loads the Rails environment only once
  • 79. delayed_job setup (use fork version) • script/plugin install git://github.com/collectiveidea/delayed_job.git • script/generate delayed_job • rake db:migrate
  • 80. delayed_job example send_later def deliver mailing = Mailing.find(params[:id]) mailing.send_later(:deliver) flash[:notice] = "Mailing is being delivered." redirect_to mailings_url end
  • 81. delayed_job example custom workers class MailingJob < Struct.new(:mailing_id) def perform mailing = Mailing.find(mailing_id) mailing.deliver end end # in your controller def deliver Delayed::Job.enqueue(MailingJob.new(params[:id])) flash[:notice] = "Mailing is being delivered." redirect_to mailings_url end
  • 82. delayed_job example always asynchronously class Device def deliver # long running method end handle_asynchronously :deliver end device = Device.new device.deliver
  • 83. Running jobs • rake jobs:work (Don’t use in production; it will exit if the database has any network connectivity problems.) • RAILS_ENV=production script/delayed_job start • RAILS_ENV=production script/delayed_job stop
  • 84. Priority • just an Integer, default is 0 • you can run multiple workers to handle jobs of different priorities • RAILS_ENV=production script/delayed_job --min-priority 3 start Delayed::Job.enqueue(MailingJob.new(params[:id]), 3) Delayed::Job.enqueue(MailingJob.new(params[:id]), -3)
  • 85. Scheduled • no guarantee of a precise time; the job just runs some time after run_at Delayed::Job.enqueue(MailingJob.new(params[:id]), 3, 3.days.from_now) Delayed::Job.enqueue(MailingJob.new(params[:id]), 3, 1.month.from_now.beginning_of_month)
  • 86. Configuring Delayed Job # config/initializers/delayed_job_config.rb Delayed::Worker.destroy_failed_jobs = false Delayed::Worker.sleep_delay = 5 # seconds to sleep when the queue is empty Delayed::Worker.max_attempts = 25 Delayed::Worker.max_run_time = 4.hours # set to the time the longest task will take
  • 87. Automatic retry on failure • If a method throws an exception, it is caught and the method is rerun later. • The method is retried up to 25 (default) times at increasingly longer intervals until it passes. • The retry time is Job.db_time_now + (job.attempts ** 4) + 5 seconds, so the wait between tries is at most about 108 hours (25 ** 4 seconds)
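The retry formula is easy to evaluate by hand. The sketch below only works through the arithmetic of `attempts ** 4 + 5` seconds; `retry_delay` is a hypothetical helper name, not part of delayed_job, and whether the final failed attempt is still rescheduled depends on the delayed_job version.

```ruby
# Delay in seconds before the next try, after `attempts` failed attempts.
def retry_delay(attempts)
  (attempts**4) + 5
end

# A few points on the curve: the delay grows very quickly.
[1, 5, 10, 25].each do |attempts|
  puts "after #{attempts} attempts: wait #{retry_delay(attempts)} seconds"
end

# The largest value the formula yields with the default of 25 attempts.
longest_hours = (retry_delay(25) / 3600.0).round(1)
puts "longest wait: #{longest_hours} hours"
```

The first retries come within seconds, so short outages recover fast, while a job that keeps failing stops putting load on the workers.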
  • 88. Capistrano Recipes • Remember to restart delayed_job after deployment • Check out lib/delayed_job/recipes.rb after "deploy:stop", "delayed_job:stop" after "deploy:start", "delayed_job:start" after "deploy:restart", "delayed_job:restart"
  • 89. Resque http://github.com/defunkt/resque • a Redis-backed library for creating background jobs, placing those jobs on multiple queues, and processing them later. • GitHub’s open source project • you can only place JSONable Ruby objects on the queue • includes a Sinatra app for monitoring what's going on • supports multiple queues • a good fit if you expect a lot of failure/chaos
  • 90. My recommendations: • General purpose: delayed_job (GitHub highly recommends DelayedJob to anyone whose site is not 50% background work.) • Time-scheduled: cron + rake
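The JSON restriction matters in practice, and its effect can be shown with Ruby's `json` library alone. This sketch does not use Resque itself; the argument list is invented for illustration.

```ruby
require 'json'

# Hypothetical job arguments; a real job passes whatever its worker needs.
args = [42, { 'format' => 'pdf' }, :urgent]

# Simulate the trip through a JSON-backed queue.
round_tripped = JSON.parse(JSON.generate(args))

puts round_tripped.inspect
puts round_tripped[2].class
```

Integers, strings, arrays and hashes survive the trip, but the symbol comes back as a string. A common way around this is to enqueue a record ID and load the object again inside the worker.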
  • 91. 5. SOA for Rails • What’s SOA • Why SOA • Considerations • The tool set
  • 92. What’s SOA? Service-oriented architecture • a “monolithic” approach is not enough • SOA is a way to design complex applications by splitting out major components into individual services and communicating via APIs. • a service is a vertical slice of functionality: database, application code and caching layer
  • 93. a monolithic web app example: request → Load Balancer → WebApps → Database
  • 94. a SOA example: request → Load Balancer → WebApps for User; request → WebApp for Administration; both web apps → Services A and Services B, each service with its own Database
  • 95. Why SOA? Isolation • Shared Resources • Encapsulation • Scalability • Interoperability • Reuse • Testability • Reduce Local Complexity
  • 96. Shared Resources • Different front-end websites use the same resources. • SOA helps you avoid duplicating databases and code. • Why not just share the database (WebApp for Administration and WebApps for User → one Database)? • code is not DRY • caching will be problematic
  • 97. Encapsulation • you can change the underlying implementation of a service without affecting other parts of the system • upgrade a library • upgrade to Ruby 1.9 • you can provide API versioning
  • 98. Scalability 1: Partitioned Data Provides • The database is the first bottleneck; a single DB server cannot scale. SOA helps you reduce database load • Anti-pattern: only splitting the database (WebApps → Database A, Database B) • model relationships are broken • referential integrity is lost • Myth of database replication: it cannot give you both speed and consistency
  • 99. Scalability 2: Caching • SOA helps you design a caching system more easily • Cache data at the right times and expire it at the right times • Cache the logical model, not the physical one • You do not need to cache views everywhere
  • 100. Scalability 3: Efficient • Different components carry different loads; SOA lets you scale each service separately. • Diagram: WebApps → Load Balancer → 2 × Services A; Load Balancer → 4 × Services B
  • 101. Security • Different services can sit behind different firewalls • You can expose only the public web and services; the others stay inside the firewall.
  • 102. Interoperability • HTTP is the common interface; SOA helps you integrate: • Multiple languages • Internal systems, e.g. a full-text search engine • Legacy databases and systems • External vendors
  • 103. Reuse • Reuse across multiple applications • Reuse for public APIs • Example: Amazon Web Services (AWS)
  • 104. Testability • Isolate problem • Mocking API calls • Reduce the time to run test suite
  • 105. Reduce Local Complexity • Team modularity along the same module splits as your software • Understandability: The amount of code is minimized to a quantity understandable by a small team • Source code control
  • 106. Considerations • Partition into Separate Services • API Design • Which Protocol
  • 107. How to partition into Separate Services • Partitioning on Logical Function • Partitioning on Read/Write Frequencies • Partitioning by Minimizing Joins • Partitioning by Iteration Speed
  • 108. API Design • Send Everything you need • Parallel HTTP requests • Send as Little as Possible • Use Logical Models
  • 109. Physical Models & Logical Models • Physical models are mapped to database tables through ORM. (It’s 3NF) • Logical models are mapped to your business problem. (External API use it) • Logical models are mapped to physical models by you.
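The mapping from physical to logical models can be sketched in plain Ruby. The three row types and the `customer_summary` function below are hypothetical; in a Rails service the rows would be ActiveRecord objects.

```ruby
require 'json'

# Physical models: one Struct per normalized table (hypothetical schema).
UserRow    = Struct.new(:id, :username)
AddressRow = Struct.new(:user_id, :city)
OrderRow   = Struct.new(:user_id, :total)

# Logical model: the shape an API consumer actually wants.
def customer_summary(user, addresses, orders)
  {
    'username'    => user.username,
    'cities'      => addresses.select { |a| a.user_id == user.id }.map(&:city),
    'order_total' => orders.select { |o| o.user_id == user.id }.sum(&:total)
  }
end

user = UserRow.new(1, 'ihower')
addresses = [AddressRow.new(1, 'Taipei'), AddressRow.new(2, 'Tokyo')]
orders = [OrderRow.new(1, 30), OrderRow.new(1, 12), OrderRow.new(2, 99)]

summary = customer_summary(user, addresses, orders)
puts JSON.generate(summary)
```

The consumer sees a single document, so the tables behind it can be split, merged or moved to another store without changing the API.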
  • 110. Logical Models • Not relational or normalized • Maintainability • can change with no change to data store • can stay the same while the data store changes • Better fit for REST interfaces • Better caching
  • 111. Which Protocol? • SOAP • XML-RPC • REST
  • 112. RESTful Web services • Rails way • REST is about resources • URL • Verbs: GET/PUT/POST/DELETE
  • 113. The tool set • Web framework • XML Parser • JSON Parser • HTTP Client
  • 114. Web framework • We do not need much of the controller and view layers • Rails is a little heavy; how about Sinatra? • Rails Metal
  • 115. ActiveResource • Mapping RESTful resources as models in a Rails application. • But not useful in practice, why?
  • 116. XML parser • http://nokogiri.org/ • Nokogiri (鋸) is an HTML, XML, SAX, and Reader parser. Among Nokogiri’s many features is the ability to search documents via XPath or CSS3 selectors.
  • 117. JSON Parser • http://github.com/brianmario/yajl-ruby/ • An extremely efficient streaming JSON parsing and encoding library. Ruby C bindings to Yajl
  • 118. HTTP Client • http://github.com/pauldix/typhoeus/ • Typhoeus runs HTTP requests in parallel while cleanly encapsulating handling logic
  • 119. Tips • Define your logical model (i.e. your service request result) first. • model.to_json and model.to_xml are easy to use, but not useful in practice.
  • 120. 6. Distributed File System • NFS does not scale • we can use rsync to duplicate files • MogileFS • http://www.danga.com/mogilefs/ • http://seattlerb.rubyforge.org/mogilefs-client/ • Amazon S3 • HDFS (Hadoop Distributed File System) • GlusterFS
  • 121. 7.Distributed Database • NoSQL • CAP theorem • Eventually consistent • HBase/Cassandra/Voldemort
  • 123. References • Books & Articles: • Distributed Programming with Ruby, Mark Bates (Addison Wesley) • Enterprise Rails, Dan Chak (O’Reilly) • Service-Oriented Design with Ruby and Rails, Paul Dix (Addison Wesley) • RESTful Web Services, Richardson & Ruby (O’Reilly) • RESTful Web Services Cookbook, Allamaraju & Amundsen (O’Reilly) • Enterprise Recipes with Ruby on Rails, Maik Schmidt (The Pragmatic Programmers) • Ruby in Practice, McAnally & Arkin (Manning) • Building Scalable Web Sites, Cal Henderson (O’Reilly) • Background Processing in Rails, Erik Andrejko (Rails Magazine) • Background Processing with Delayed_Job, James Harrison (Rails Magazine) • Web 点 ( ) • Slides: • Background Processing (Rob Mack) Austin on Rails - April 2009 • The Current State of Asynchronous Processing in Ruby (Mathias Meyer, Peritor GmbH) • Asynchronous Processing (Jonathan Dahl) • Long-Running Tasks In Rails Without Much Effort (Andy Stewart) - April 2008 • Starling + Workling: simple distributed background jobs with Twitter’s queuing system, Rany Keddo 2008 • Physical Models & Logical Models in Rails, Dan Chak
  • 124. References • Links: • http://segment7.net/projects/ruby/drb/ • http://www.slideshare.net/luccastera/concurrent-programming-with-ruby-and-tuple-spaces • http://github.com/blog/542-introducing-resque • http://www.engineyard.com/blog/2009/5-tips-for-deploying-background-jobs/ • http://www.opensourcery.co.za/2008/07/07/messaging-and-ruby-part-1-the-big-picture/ • http://leemoonsoo.blogspot.com/2009/04/simple-comparison-open-source.html • http://blog.gslin.org/archives/2009/07/25/2065/ • http://www.javaeye.com/topic/524977 • http://www.allthingsdistributed.com/2008/12/eventually_consistent.html
  • 125. Todo (maybe next time) • AMQP/RabbitMQ example code • How about Nanite? • XMPP • MagLev VM • More MapReduce example code • How about Amazon Elastic MapReduce? • Resque example code • More SOA example and code • MogileFS example code