Friday, April 12, 2013

Data migration through browser automation

As we have mentioned quite a few times, browser automation has many practical applications, ranging from testing web applications to web-based administration tasks and web scraping. The latter is our field of expertise, and Selenium is our tool of choice for automated browser interaction and for dealing with complex, JavaScript-rich pages.
    One very interesting scenario for combining our beloved web scraping tool, DEiXTo, with Selenium is data migration. Imagine, for example, that you have an osCommerce online store and you would like to migrate it to a Joomla VirtueMart e-commerce system. Wouldn't it be great if you could scrape the product details from the old online catalogue with DEiXTo and then automate the data entry work via Selenium? Once we have the data at hand in a suitable format, e.g. comma- or tab-delimited text or XML, we could write a script that repeatedly visits the online data entry form (in the administration environment of the new e-shop), fills in the necessary fields and submits it, once for each product, so that all the products are inserted into the new website automatically.
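    To make the idea a bit more concrete, here is a minimal Python sketch of such a script, assuming the scraped products are stored in a tab-delimited file with a header row. The column names, the form field names in FIELD_MAP, the submit button selector and the form URL are all hypothetical and would have to be adapted to the actual administration form. The sketch also assumes Selenium 4 with Firefox, and it leaves out logging in to the administration area.

```python
import csv
import io

# Hypothetical mapping: column in the scraped file -> "name" attribute
# of the matching input on the new shop's product entry form.
FIELD_MAP = {
    "name": "product_name",
    "price": "product_price",
    "description": "product_desc",
}


def read_products(stream, delimiter="\t"):
    """Parse delimited text with a header row into a list of dicts."""
    return list(csv.DictReader(stream, delimiter=delimiter))


def form_values(product, field_map=FIELD_MAP):
    """Translate one scraped record into {form field name: value}."""
    return {field_map[col]: product[col] for col in field_map if col in product}


def migrate(products, form_url):
    """Open the data entry form and submit it once per product.

    Assumes Selenium 4 and Firefox are installed; authentication to the
    administration area is not handled here.
    """
    # Imported here so the parsing helpers work without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Firefox()
    try:
        for product in products:
            driver.get(form_url)
            for field, value in form_values(product).items():
                box = driver.find_element(By.NAME, field)
                box.clear()
                box.send_keys(value)
            # Hypothetical selector for the form's submit button.
            driver.find_element(By.CSS_SELECTOR, "[type='submit']").click()
    finally:
        driver.quit()


# Small in-memory sample standing in for the file produced by the scraper.
sample = "name\tprice\tdescription\nBlue mug\t4.50\tCeramic, 300 ml\n"
products = read_products(io.StringIO(sample))
```

    Calling migrate(products, form_url) with the address of the real product form would then enter the records one by one. Keeping the column-to-field mapping in a single dictionary means that only FIELD_MAP has to change when the target form differs.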


    This way you can save a lot of time and effort and avoid wrestling with complex data migration tools (useful as they are in many cases). Important: of course we don't claim that migrating databases through scraping and automated data entry is the best solution. However, it's a nice and quick alternative, especially for relatively simple cases. The big advantage is that you don't even need to know the underlying schemas of the two systems involved. The only requirement is access to the administrator interface of the new system.
    By the way, below you can see a screenshot from Altova MapForce, maybe the best (but not free) data mapping, conversion and integration software tool out there.


    Generally speaking, the uses and applications of web data extraction are numerous. You can check out some of them here. Perhaps you are about to think of the next one, and we would be glad to help you with the technicalities!
