Sahana Disaster Management System
Google Summer of Code 09
P2P Synchronization of Sahana Servers
(SahanaPHP <> SahanaPy, SahanaPHP <> SahanaPHP, SahanaPy <> SahanaPy)
Name: Syed Hasanat Ali Kazmi
Email: hasanatkazmi@gmail.com
Freenode IRC Nickname: hasanatkazmi
IM gtalk/jabber: hasanatkazmi@gmail.com
Biographical information:
Present Address: Room 102, M3, Lahore University of Management Sciences,
Lahore, Pakistan.
Country: Pakistan
TimeZone: GMT+5
Education: BSc Computer Science Major, Junior year.
CV: http://hasanatkazmi.googlepages.com/hasanat-cv.pdf
Overview of your exposure to similar technologies and/or FOSS in general:
I have hands-on experience with web development on LAMP (both Python
and PHP), MVC, Python, Linux (Debian), web services, and .NET.
“Why would you like to help the Sahana project?”:
I hail from Muzaffarabad, which was the center of the 2005 Pakistan
earthquake. I have seen the misery there, and I have seen Sahana
helping on the ground. Besides emotional reasons, I am interested in
cloud computing and clustering. Syncing two servers is the first step
in clustering, and I want to take that first step.
Have you reviewed the Important Dates and Times?
Yes. I will be free for the whole coding part of the project except
the first few days (up to 5 days).
Do you have any significant conflicts with the listed schedule? If so,
please list them here.
No
Will you need to finish your project prior to the end of the GSOC?
No
Are there any significant periods during the summer that you will not be
available?
No (except for the first few days, around 4 days)
Syncing two Sahana servers is basically about syncing their databases.
It cannot be a simple database dump because SahanaPHP and SahanaPy use
different database semantics. So first I will create a module in
SahanaPy which exports the database in the same form as SahanaPHP
does. The idea is to build upon the existing export mechanism. I will
also extend the Synchronization module in each Sahana server to add
automatic synchronization through a new feature called 'Sync Pool' (as
shown in the diagram). It will list all the pools to which that
instance of Sahana belongs. A user (admin) can join new pools and edit
the relationship with each pool (such as the types of data to import,
the time lag between sync attempts, etc.).
Software which implements Zero Configuration runs continuously in the
background. Its search mechanism consists of two tasks. 1: It exposes
services named after all the pools to which the instance of Sahana has
subscribed. 2: It searches for all other Sahana instances which have
subscribed to the same pools. Once it finds servers which are in the
same pool, it fetches their database using the synchronization
methodology currently present (HTTP GET/POST). The user can also
specify a server by IP/URL. The software then parses the XML and adds
the unique records to the database. At installation time, the software
needs to know the administrative password of the database.
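The "adds unique records" step can be sketched as follows. This is only
a minimal illustration: the element names (record, field) and the uuid
attribute used as the unique key are assumptions made for the example,
not the actual SahanaPHP XML format, and a plain dictionary stands in
for the database.

```python
import xml.etree.ElementTree as ET


def merge_unique_records(xml_text, local_records):
    """Add records from an exported XML document that are not yet known locally.

    local_records maps a record id to a dict of field values.
    Returns the number of records that were added.
    """
    root = ET.fromstring(xml_text)
    added = 0
    for record in root.iter("record"):       # hypothetical element name
        record_id = record.get("uuid")       # hypothetical unique key
        if record_id is None or record_id in local_records:
            continue                         # skip unkeyed and already-known records
        local_records[record_id] = {
            field.get("name"): (field.text or "")
            for field in record.findall("field")
        }
        added += 1
    return added
```

Records that already exist locally are left untouched, so running the
same import twice does not create duplicates.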
This section is to provide the detail of your project proposal. Take as
much space as is necessary.
Project Deliverable - What is the essence of the project? What
capability are you looking at adding to Sahana that will expand its
capability for emergency/disaster management?
The core of the project is to implement automatic syncing between
different instances of Sahana servers, irrespective of whether they
are PHP or Python servers. Syncing of Sahana servers is very important
because there are many cases in which multiple instances of Sahana are
installed. For example, in the event of the Kashmir earthquake,
different groups leave for different villages. Each group has a laptop
with an instance of the Sahana server installed. Other group members
access the server using PDAs and add data on the go. When all groups
return to the base camp, they have to export their data manually and
the central server has to import it manually. This process needs to be
automated. Automation will dramatically increase the capability of the
system in an emergency, as everyone will be able to sync with other
servers on the go. The technical demands on the user will also be very
low, as in most cases the admin/user does not need to know the IP
addresses of other instances of Sahana (though this option will also
be provided to override the automatic search for IP addresses).
Project Justification - What is the relevance of your idea to the
project? Why do you think it’s important to *have* this idea integrated
to the Sahana system.
This project is very relevant to Sahana, and the community has been
taken into confidence about it. The community expressed a dire need
for automatic syncing. This project will enhance the capabilities of
Sahana in practical situations, as it will make the two different
types of servers compatible with each other and will streamline
syncing between them.
Implementation Plan - How are you going to implement your project? Use
this section to expand in as much detail as possible how it should be
constructed.
(References for explanation:
http://hasanatkazmi.googlepages.com/sahana-mod.jpg
http://hasanatkazmi.googlepages.com/sahana-rough.jpg )
Here is the step-by-step implementation of this project:
1. Manual synchronization module in SahanaPy:
The first part of the system is to build in SahanaPy almost the same
synchronization module as is present in SahanaPHP. SahanaPy is based
on web2py, which provides smooth access to the database. Converting
the data to XML will be carried out using the XML libraries in Python.
I will implement the same XML conversion mechanism as is present in
SahanaPHP so that the two are fully compatible with each other.
Similarly, an import feature will be implemented in which the XML is
parsed and the data is added to the database. I will closely follow
all conventions used in SahanaPHP so that a programmer with experience
on SahanaPHP who starts developing on SahanaPy feels familiar with the
system.
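As a rough illustration of the export direction, the sketch below turns
database rows into XML with Python's standard xml.etree.ElementTree
library. The element and attribute names (table, record, field, name)
are placeholders chosen for the example; the real module would have to
reproduce whatever layout SahanaPHP already emits. Rows are assumed to
arrive as plain dictionaries.

```python
import xml.etree.ElementTree as ET


def rows_to_xml(table_name, rows):
    """Serialize a list of row dictionaries into one XML string."""
    root = ET.Element("table", {"name": table_name})   # placeholder layout
    for row in rows:
        record = ET.SubElement(root, "record")
        for column, value in row.items():
            field = ET.SubElement(record, "field", {"name": column})
            # Store every value as text; NULL becomes an empty field.
            field.text = "" if value is None else str(value)
    return ET.tostring(root, encoding="unicode")
```

For example, rows_to_xml("person", [{"id": 1, "name": "Ali"}]) yields a
single table element containing one record with two field children.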
An important point is that this module can be embedded in a
programmatic system so that data is imported and exported
automatically. Because importing and exporting are based entirely on
HTTP GET/POST requests, they can be driven by an automatic system, and
that is what I will do.
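A programmatic client for such an export could look roughly like the
sketch below, which uses only Python's standard urllib. The path
/sync/export and the types query parameter are invented for
illustration; the actual URLs would be whatever the synchronization
modules expose.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_export_url(base_url, data_types):
    """Build the GET URL for an export (hypothetical path and parameter)."""
    query = urlencode({"types": ",".join(sorted(data_types))})
    return f"{base_url.rstrip('/')}/sync/export?{query}"


def fetch_export(base_url, data_types, timeout=30):
    """Download the exported XML from a remote Sahana instance."""
    url = build_export_url(base_url, data_types)
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Keeping URL construction separate from the network call makes the
request format easy to check without a running server.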
2. Sync Pools:
After implementing manual syncing, I will expand this module in both
versions of Sahana and introduce “Sync Pool” in both types of servers.
As shown in the figure, a Sync Pool defines the relationship that an
instance of Sahana has with other instances of Sahana.
What is a sync pool?
A sync pool can be understood as a table. A user (say, the admin) of a
system adds all the pools which he/she wants to join. For each pool,
the user checks the types of data to import from that pool and enters
the time interval after which the system should look for new data. For
simplicity, consider that there is only one pool to join. When the
user joins that pool, the system raises a flag on the network saying
that this server is in the pool; the system also looks for all other
machines with the same flag. When such machines are found, the system
fetches the types of data which the user has checked. The software
repeats the process after each configured interval.
The sync pool also offers the choice of adding the IP address / URI of
a server in case the user wants to override the automatic server
search.
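One row of that table could be modelled as in the sketch below. The
field names are illustrative assumptions, not an existing Sahana
schema; they simply mirror the settings described above (pool name,
checked data types, sync interval, and optional manually entered
servers).

```python
from dataclasses import dataclass, field


@dataclass
class SyncPoolMembership:
    """One pool that this Sahana instance has joined (hypothetical model)."""
    pool_name: str
    import_types: set = field(default_factory=set)    # data types the admin checked
    interval_hours: float = 1.0                       # time between sync attempts
    manual_peers: list = field(default_factory=list)  # IPs/URIs that bypass discovery

    def wants(self, data_type):
        """True if this data type should be imported from the pool."""
        return data_type in self.import_types

    def is_due(self, hours_since_last_sync):
        """True once the configured interval has elapsed."""
        return hours_since_last_sync >= self.interval_hours
```

An instance such as SyncPoolMembership("abc NGO", {"missing_person"}, 6)
would import only missing-person data, at most once every six hours.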
A practical use case: an NGO wanting to sync all instances of Sahana.
An NGO has deployed many instances of Sahana, some running on Python
and others on PHP, and it wants all of its Sahana servers to sync. In
that case, the NGO's admin instructs the admins of all Sahana
instances to join a pool called "abc NGO". When everyone (including
the NGO's admin) joins the pool and checks all options for data
import, the result is complete synchronization between all instances
of that NGO. If the admin wants one-way synchronization, i.e. all data
must come to the main server and no data should be sent to the field
servers (as this is not required in some cases), he can ask the admins
of the field instances to uncheck all data import options while he
checks all options at his main server. All data will then flow towards
the main server only.
Please note that once you join a pool, you expose your data for import
by the other machines in the same pool.
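The direction of data flow that results from these choices can be
worked out mechanically. The sketch below assumes, as stated above,
that every pool member exposes all of its data and that each member
pulls only the types it has checked; the server and data type names
are made up for the example.

```python
def data_flows(import_choices):
    """Return the set of (exporter, importer, data_type) transfers in a pool.

    import_choices maps each server name to the set of data types
    that server has checked for import.
    """
    flows = set()
    for importer, wanted in import_choices.items():
        for exporter in import_choices:
            if exporter == importer:
                continue                      # a server never pulls from itself
            for data_type in wanted:
                flows.add((exporter, importer, data_type))
    return flows
```

With {"main": {"people"}, "field1": set(), "field2": set()} the only
transfers are from the two field servers to the main server, which is
the one-way setup from the use case.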
How will this be implemented?
The whole system is based on Zero Configuration. The Sahana community
has already worked on it. Zero Configuration automates the process of
finding machines which offer a particular service.
Please note that for this system to work, it has to expose services to
the network continuously, and this can only be achieved with software
that runs continuously. So it cannot be implemented as a website,
though we can add its executable to the executables of the server to
make its execution transparent (I do not promise that I will do this,
as my concentration will not be on this part).
An example shows how it works.
Suppose we have multiple instances of Sahana, each of which has joined
a pool called “abc NGO” and has checked all data types to be imported
every x hours. The software, which now runs as a daemon, will interact
with the network in two ways.
a: It will expose a service with the name “abc NGO” to the network.
b: It will continuously look for servers which have exposed a service
with the name “abc NGO”. For every server which exposes this service,
it will fetch the checked types of data (as described under sync pool)
using the manual synchronization module, not this software itself. As
mentioned earlier, the data can be fetched using HTTP requests.
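One pass of task (b) might be organised as in the sketch below. The
three callables are stand-ins supplied by the caller: discover_peers
for the Zero Configuration lookup, fetch_export for the HTTP request to
the manual synchronization module, and import_xml for the local import.
None of them refers to an existing Sahana or Zeroconf API.

```python
def sync_once(pool_name, wanted_types, discover_peers, fetch_export, import_xml):
    """Run one synchronization pass over every peer found in a pool.

    Returns a dict mapping each peer to a short outcome message.
    """
    results = {}
    for peer in discover_peers(pool_name):
        try:
            xml_text = fetch_export(peer, wanted_types)
        except OSError as error:              # network failures, timeouts, refused connections
            results[peer] = f"failed: {error}"
            continue                          # one unreachable peer must not stop the pass
        imported = import_xml(xml_text)
        results[peer] = f"imported {imported} records"
    return results
```

The daemon would call this function once per configured interval for
each pool the instance has joined.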
Among the most significant hitches in network communication are data
unavailability, server downtime, etc. Therefore a log of all data
communication will be maintained. The current data synchronization
module in SahanaPHP provides a sync history, and this will also be
implemented in SahanaPy. Because the automatic system uses the same
module for data synchronization, it will automatically list past
synchronizations.
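A sync history of this kind could be kept in a small table, as sketched
below with Python's built-in sqlite3 module. The table name and columns
are assumptions made for the example and are not the SahanaPHP history
schema.

```python
import sqlite3
from datetime import datetime, timezone


def open_history(path=":memory:"):
    """Open (and create if needed) the sync history table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sync_history ("
        "synced_at TEXT NOT NULL, peer TEXT NOT NULL, pool TEXT NOT NULL, "
        "succeeded INTEGER NOT NULL, detail TEXT)"
    )
    return conn


def record_attempt(conn, peer, pool, succeeded, detail=""):
    """Log one synchronization attempt, successful or not."""
    timestamp = datetime.now(timezone.utc).isoformat()
    conn.execute(
        "INSERT INTO sync_history VALUES (?, ?, ?, ?, ?)",
        (timestamp, peer, pool, int(succeeded), detail),
    )
    conn.commit()


def failed_attempts(conn):
    """List the attempts that did not succeed, oldest first."""
    query = ("SELECT peer, pool, detail FROM sync_history "
             "WHERE succeeded = 0 ORDER BY synced_at")
    return conn.execute(query).fetchall()
```

Logging failures alongside successes lets an admin see which servers
were unreachable and when.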
Documentation is an essential part of software, especially FOSS.
Standard documentation will be written along with the code.
Future Options - Identify some aspects of the project that may not be
within the scope of this submission, but could form the basis for
future work that would build upon the outcomes of your project.
The implementation will be generic and expandable. Potential future
extensions include:
In case Zero Configuration does not meet the challenges (as feared by
someone on the mailing list), it can be unplugged, since it will be
implemented as a 'plug-in' to the system, and a new codebase can be
introduced in its place.
This project will create the basic framework for inter-server
communication. If a need arises in future to exchange other data
between servers, that can be implemented on this infrastructure. It
can provide an API for the system (SahanaPHP and SahanaPy) to
communicate with remote servers.
For much larger systems, exchanging large XML files can be
impractical; therefore, binary diffs can be introduced into the
present architecture.
Relevant Experience - Please list all experience you have that is
directly relevant to the proposed project, and how they would help you
deliver the project. If you have contributed to the Sahana project
previously, please clearly outline your contributions.
I have deep knowledge of and expertise with PHP, Python, Linux, and
MVC. This project basically requires Python, web2py, PHP, web
development, and a basic understanding of networking. I have coded
extensively in Python and PHP and have built many websites. I have
used the MVC model of web development in JSP. So I believe I am fully
equipped with the knowledge required to complete this project.
Work already undertaken - What research have you undertaken in this
area in advance? (These can just be bullet points and are not required
to follow the SMART methodology)
I have gone through web2py, as I haven’t previously worked with it.
I used a Python implementation of Zero Configuration to test whether
it works; it worked across my router.
First trimester (20 April - 22 May) - identify the SMART goals you have
for the community bonding period. Most of these are likely to revolve
around further scoping of the project with the community, engaging with
the community, and updating and finalizing the project plan.
(I have already discussed my idea on sahana-dev, but it definitely needs
further development)
Due Date | SMART Goal | Measure
27 April | Understanding of XML conventions / protocol used for XML conversion | Write-up describing step by step method used for conversion to XML
3 May | Understanding of present work done by community for ZeroConfig | A write-up describing present ZeroConfig status in the project and step by step description of how it is going to be used
10 May | Updating the proposal for the last time with feedback from community | Confirmation email from mentor in support of all changes
22 May | Discussing my whole implementation plan with community, taking their notes, giving final touch to project workflow | A technical write-up with step by step description of how the coding will be done, confirmation email from mentor
Second trimester (23 May - 6 July) - identify the draft SMART goals you
have for the first half of the project. These will be used to assess
project process and form the basis for the mid-term evaluation.
Due Date | SMART Goal | Measure
8 June | Data import part of synchronization module completed | Demo of import part
15 June | Beta version of web2py module for manual data synchronization | A working (though buggy) and complete demo of manual synchronization module for SahanaPy
6 July | Manual synchronization module completed | Demo of final version of web2py synchronization module
Third trimester (7 July - 10 August) - identify the draft SMART goals
you have for the second half of the project. These will be used to
assess the whole project and in conjunction with the mid-term goals,
form the basis for the final evaluation.
Due Date | SMART Goal | Measure
11 July | Sync pool implementation in SahanaPHP without the software | Demo of working system
25 July | Sync pool implementation in SahanaPy without the software | Demo of working system
5 August | Beta of complete software | Demo
10 August | Final version, bug free and totally integrated | Final demo, confirmation from mentor
