3. DSC
• Hosting, Web application development, and
consultancy
• Hosts the crew email system and carried out
the intranet integration for Virgin Atlantic.
• Runs several large forums.
• e.g. pprune.org - 150k members, 2600 at one
time
• Also AskDirect, Mumsnet
5. Working With Rails
• Largest index of Ruby on Rails developers in the world
• Over 7000 people listed
• From 104 countries
• Find out who’s who?
• Connect with others
• Find a developer for a project / employment
• Also lists groups, companies, and sites
17. Issues
• No longer supported by author
• Feeds fetched at the user's expense (and
therefore the Mongrels')
• Feeds are cached locally but are parsed on
request
• Known problems when scaling (search on
Google)
22. The Result
• Occasional slow loading pages that include
third party feeds
• Stale feed items
• Inconsistent feed items
• No good!
23. The Challenge
• To keep content fresh, push traffic to WWR
and out to the blog owners
• Different feeds and sources to consider:
Flickr, Twitter, Blog, Delicious
• Each needs to be displayed in multiple places,
in many ways
• But also want to do some funkier stuff (as
you’ll see a bit later)
25. DRb
Basic building block of all other Ruby
distributed libs.
“DRb literally stands for "Distributed Ruby". It is a library that allows you
to send and receive messages from remote Ruby objects via TCP/IP. Sound
kind of like RPC, CORBA or Java's RMI? Probably so. This is Ruby's simple
as dirt answer to all of the above.”
http://chadfowler.com/ruby/drb.html
28. Quick DRb Example
Server (server.rb)
require 'drb'
class TestServer
  def doit
    "Hello, Distributed World"
  end
end
aServerObject = TestServer.new
DRb.start_service('druby://localhost:9000', aServerObject)
DRb.thread.join # Don't exit just yet!
Client (client.rb)
require 'drb'
DRb.start_service()
obj = DRbObject.new(nil, 'druby://localhost:9000')
# Now use obj
p obj.doit
> ruby server.rb
> ruby client.rb
"Hello, Distributed World"
29. Basics
• Server
• Clients / Workers
• Communicate via messages
http://en.wikipedia.org/wiki/Distributed_computing
30. BackgroundRB
• Ruby job server and scheduler.
• Integrates with Rails
• Quite complex
• Some issues between versions, but many
favor it over the other libs
• Best known
http://backgroundrb.rubyforge.org/
31. Starfish
• Inspired by Google’s MapReduce
• Easy to understand code
• Stability?
• No longer supported by author?
http://rufy.com/starfish/doc/
32. reliable-message
• Solid library
• Easy to understand API
• A bit more involved to set up
• Can be integrated with Rails
• Ongoing development
http://trac.labnotes.org/cgi-bin/trac.cgi/wiki/Ruby/ReliableMessaging
33. AP4R
• Asynchronous Processing for Ruby
• Lesser known lib from Japan (new kid on the
block)
• Integrates with Rails
• Built on top of reliable-message
34. AP4R
• AP4R, Asynchronous Processing for Ruby, is
the implementation of reliable asynchronous
message processing. It provides message
queuing, and message dispatching.
• Using asynchronous processing, we can cut
down turn-around-time of web applications
by queuing, or can utilize more machine
power by load-balancing.
35. AP4R Features
• Business logic can be implemented as a simple web application, or as Ruby code, whether it is called
asynchronously or synchronously.
• Asynchronous messaging is made reliable by RDBMS persistence (currently MySQL only) or file
persistence, courtesy of reliable-msg.
• Load balancing over multiple AP4R processes on one or more servers is supported.
• Asynchronous logic is called via various protocols, such as XML-RPC, SOAP, HTTP PUT, and
more.
• Using the store-and-forward function, an at-least-once QoS level is provided.
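The at-least-once guarantee in the last bullet can be illustrated with a minimal sketch in plain Ruby. It does not use AP4R or reliable-msg: `StoreAndForward`, `enqueue`, and `deliver_all` are hypothetical names, and an in-memory Hash stands in for the MySQL or file store. A message is deleted only once the receiver acknowledges it, so a failed delivery is retried.

```ruby
# Minimal store-and-forward sketch (hypothetical names, not the AP4R API).
class StoreAndForward
  def initialize
    @store = {}      # stand-in for RDBMS or file persistence
    @next_id = 0
  end

  # Persist the message before any delivery is attempted.
  def enqueue(body)
    @next_id += 1
    @store[@next_id] = body
    @next_id
  end

  def pending
    @store.size
  end

  # Try to deliver every stored message; delete only on acknowledgement.
  # The block returns true to acknowledge. Raising or returning false
  # leaves the message in the store for a later retry.
  def deliver_all
    @store.keys.each do |id|
      acked = begin
        yield(id, @store[id])
      rescue StandardError
        false
      end
      @store.delete(id) if acked
    end
  end
end

saf = StoreAndForward.new
saf.enqueue('fetch feed 1')
saf.enqueue('fetch feed 2')

attempts = Hash.new(0)
flaky_receiver = lambda do |id, _body|
  attempts[id] += 1
  raise 'connection closed' if id == 2 && attempts[id] == 1 # first try fails
  true
end

saf.deliver_all(&flaky_receiver) # message 2 fails and stays stored
pending_after_first_pass = saf.pending
saf.deliver_all(&flaky_receiver) # the retry succeeds
pending_after_retry = saf.pending
```

Because a message can be delivered more than once when an acknowledgement is lost, the receiving logic should be safe to run twice.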
36. AP4R Process Flow
• A client (e.g. a web browser) makes a request to a web server (Apache, Lighttpd, etc.).
• A Rails application (the synchronous logic) is executed on Mongrel via mod_proxy or similar.
• At the end of the synchronous logic, message(s) are put to AP4R (AP4R provides a helper).
• Once the synchronous logic is done, the client receives a response immediately.
• AP4R queues the message and sends it to the web server as an asynchronous request.
• The asynchronous logic, implemented as a normal Rails action, is executed.
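The flow above can be sketched in plain Ruby, without AP4R or Rails. Everything here is an assumption made for illustration: `sync_action`, `async_action`, and the message format are hypothetical, a `Queue` stands in for the AP4R message queue, and a thread stands in for the dispatcher that would normally call back into the web server.

```ruby
# Sketch of the request flow in plain Ruby (hypothetical names, no AP4R or Rails).
log = Queue.new      # thread-safe record of the asynchronous work
messages = Queue.new # stand-in for the AP4R message queue

# "Synchronous logic": do the quick work, queue the slow work, respond.
sync_action = lambda do |user_id|
  messages << { action: :refresh_feeds, user_id: user_id }
  "200 OK for user #{user_id}"
end

# "Asynchronous logic": in AP4R this would be an ordinary Rails action.
async_action = lambda do |message|
  log << "refreshed feeds for user #{message[:user_id]}"
end

# Dispatcher: takes messages off the queue and calls the asynchronous logic.
dispatcher = Thread.new do
  while (message = messages.pop) != :stop
    async_action.call(message)
  end
end

response = sync_action.call(42) # returns without waiting for the feed refresh
messages << :stop
dispatcher.join

results = []
results << log.pop until log.empty?
```

The point of the design is that `sync_action` only pays the cost of putting a message on the queue; the slow work happens after the client already has its response.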
37. AP4R example
A Hello World app ships with AP4R to get
you started.
There is also a nice guide here:
http://rubyforge.org/frs/download.php/13312/AP4R_Users_Guide_EN.pdf
39. Rinda
• Rinda::Ring allows DRb services and clients
to automatically find each other without
knowing where they live.
• DRb servers register themselves with a
RingServer which allows clients to find the
servers they need. Many servers may
register themselves with the RingServer. The
DRb servers don't need to run on the same
machine.
http://segment7.net/projects/ruby/drb/rinda/ringserver.html
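A RingServer keeps its registrations in a Rinda::TupleSpace, as tuples of the form `[:name, service_name, object, description]`. The sketch below shows only that registry idea, in a single process and without the network discovery that Rinda::Ring adds on top. The `FeedFetcher` class, the service names, and the descriptions are hypothetical.

```ruby
require 'rinda/tuplespace'

# In a real deployment this tuple space would sit behind a Rinda::RingServer
# and be reached over DRb; here it is used directly, in one process.
registry = Rinda::TupleSpace.new

# Hypothetical service object standing in for a DRb server.
class FeedFetcher
  attr_reader :name

  def initialize(name)
    @name = name
  end
end

# Two servers register themselves under the same service name.
registry.write([:name, :FeedFetcher, FeedFetcher.new('fetcher-1'), 'feed worker on host A'])
registry.write([:name, :FeedFetcher, FeedFetcher.new('fetcher-2'), 'feed worker on host B'])

# A client looks services up by name; nil in a template matches anything.
found = registry.read_all([:name, :FeedFetcher, nil, nil])
names = found.map { |_, _, service, _| service.name }.sort
```

With Rinda::Ring, the client would locate the RingServer automatically and run the same kind of lookup against it, which is why neither side needs to know where the other lives.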
40. RingyDingy
• RingyDingy automatically registers a service
with a RingServer. If communication between
the RingServer and the RingyDingy is lost,
RingyDingy will re-register its service with
the RingServer when it reappears.
http://seattlerb.rubyforge.org/RingyDingy/
41. Feeds in WWR
[Diagram: AP4R Server, Feed Queue (@queue), and Feed Fetchers 1, 2, ... N]
43. Key points
• The Feed Queue fetches the URLs of stale
feeds
• Each worker (client) has the Rails
environment loaded
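As a rough illustration of these key points, the sketch below selects stale feeds and hands their URLs to N workers. It is not the WWR code: the `Feed` struct, the 15-minute freshness window, and the example URLs are assumptions, and threads stand in for the separate worker processes that each load the Rails environment.

```ruby
# Sketch of a feed queue served by N workers (hypothetical names and data).
Feed = Struct.new(:url, :fetched_at)

STALE_AFTER = 15 * 60 # seconds; an assumed freshness window

# A feed is stale if it was never fetched or was fetched too long ago.
def stale_urls(feeds, now)
  feeds.select { |f| f.fetched_at.nil? || now - f.fetched_at > STALE_AFTER }
       .map(&:url)
end

def run_workers(urls, worker_count)
  queue = Queue.new
  urls.each { |url| queue << url }
  worker_count.times { queue << :done } # one stop marker per worker
  fetched = Queue.new

  workers = Array.new(worker_count) do |n|
    Thread.new do
      while (url = queue.pop) != :done
        # A real fetcher would download and parse the feed here.
        fetched << [n + 1, url]
      end
    end
  end
  workers.each(&:join)

  results = []
  results << fetched.pop until fetched.empty?
  results
end

now = Time.now
feeds = [
  Feed.new('http://example.com/a.rss', now - 3600), # stale
  Feed.new('http://example.com/b.rss', now - 60),   # fresh
  Feed.new('http://example.com/c.rss', nil)         # never fetched
]
urls = stale_urls(feeds, now)
results = run_workers(urls, 3)
```

Each URL is taken by exactly one worker, so adding workers increases throughput without fetching the same feed twice.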
44. With this solution
• Can scale as demand grows
• Flexible for any type of feed data
• Still - room for improvement
45. Possible Improvements
• Automatic spawning and killing of workers
as queue size grows or decreases
• Better handling of feed errors
• Dynamic polling intervals based on user-defined
prefs or some intelligent logic
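Two of these improvements reduce to small decision rules, sketched here. Both functions, their names, and every threshold are hypothetical and chosen only for illustration: one sizes the worker pool from the queue length, the other halves or doubles a feed's polling interval depending on whether the last fetch found new items.

```ruby
# Hypothetical sizing and polling rules; the thresholds are illustrative only.

# How many workers to run for a given queue size, within fixed bounds.
def desired_workers(queue_size, per_worker: 50, min: 1, max: 10)
  needed = (queue_size.to_f / per_worker).ceil
  needed.clamp(min, max)
end

# Poll twice as often when the feed changed, half as often when it did not.
# Intervals are in seconds and kept between 5 minutes and 1 day.
def next_poll_interval(current, changed, min: 300, max: 86_400)
  candidate = changed ? current / 2 : current * 2
  candidate.clamp(min, max)
end

workers_for_small_queue = desired_workers(120)
interval_after_change = next_poll_interval(3600, true)
interval_after_no_change = next_poll_interval(3600, false)
```

A supervisor process could call `desired_workers` periodically and spawn or kill workers to match, while `next_poll_interval` would be stored per feed after each fetch.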
46. When to go distributed?
• Long running process or task
• Fetching external data
• Complex computations
• ... that can be broken into chunks of work
• You care about the user experience
47. Pitfalls
• Dependencies
• 'connection closed' errors on Mac (IPv6):
change all references to localhost to 127.0.0.1 to
avoid them (had to patch reliable-message)
• Terminology to understand
• Memory requirements
48. Do you need distributed?
• Maybe you would be better scheduling
instead?
• http://www.igvita.com/blog/2007/03/29/scheduling-tasks-in-ruby-rails/
57. Thanks!
http://www.dsc.net
http://www.workingwithrails.com
Blog: http://beyondthetype.com
Enjoyed the talk? Recommend me on WWR
http://workingwithrails.com/person/5152-martin-sadler