This page is outdated. Please refer to the new documentation.

How to scrape all data from a table?

This is much the same as the tutorial "Scrape multiple sets of data from a yellow page". In the "scrape page" node, select every row of the table with "group select", then add "capture content" nodes to capture each field in the rows.

The only difference is that a table row is a DOM "tr" element, which is not directly visible and cannot be selected by clicking it on the page. Instead, select a "td" and then use "expand select" to move up to its "tr". The steps in detail:

1. Select the "scrape page" node in the scene and check "Extract multiple sets of data", because you want to scrape multiple rows of data.

2. Click "select target" and select a "td" target in the table on the page. The XPath will look like "xxx/xxx/td".

3. Click "expand select" to select the "tr". The XPath should now look like "xxxxx/xxxx/tr".

4. Check "group select", then click another row in the table to select all "tr" elements.

After this, you can add "capture content" nodes under it to capture each field in the rows.
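
The relationship between the XPaths in steps 2 to 4 can be sketched in JavaScript. This is only an illustration: the helper name parentXPath and the sample path are made up, and the XPath that FMiner actually generates may differ.

```javascript
// Hypothetical helper: returns the XPath of the parent element by dropping
// the last location step. Assumes a simple path with no "/" inside predicates.
function parentXPath(xpath) {
  const cut = xpath.lastIndexOf("/");
  return cut > 0 ? xpath.slice(0, cut) : xpath;
}

// Made-up XPath of a single cell, as step 2 might produce it.
const cellPath = "/html/body/table/tbody/tr[2]/td[1]";

// Moving from the cell to its row, as "expand select" is described in step 3.
const rowPath = parentXPath(cellPath);

// Assumption: selecting all rows corresponds to dropping the row index.
const allRowsPath = rowPath.replace(/\[\d+\]$/, "");

console.log(rowPath);     // → /html/body/table/tbody/tr[2]
console.log(allRowsPath); // → /html/body/table/tbody/tr
```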

 

Error "Fminer cannot be opened because it is from an unverified developer" on Mac OS

1. Right-click (or control-click) the application in question and choose “Open”.

2. Click the “Open” button in the warning dialog that follows to launch the app anyway.

 

Error: Failed loading page (Protocol "javascript" is unknown)

If you see this error when using the "open link" action to open a link, the link is not a real link but a button that triggers ajax/javascript. Use the "click" action for it instead. If it is a "next page" link, see this tutorial for a loop: http://www.fminer.com/23-click-next-page-button-ajax-pages/

 

How to scrape pages without "next link"

Some pages have no "next link" and only show page numbers such as 1, 2, 3, 4, 5, 6, 7, 8, 9, 10. When you go to page 10, the list changes to 10, 11, 12, 13, 14. For sites of this kind:

1. If the links are real links that work with "openlink(s)", you can select all the page links (2, 3, ...) as the next links. FMiner opens these links recursively and does not open the same link twice.

2. If the links are ajax buttons, see this post: http://www.fminer.com/forum/topic/155/

 

How to go back?

You can use the runjs action with this code: window.history.back()

 

How to scrape hidden DOM text?

First try changing "extract type" to "DOM attribute" -> "inner text". If that does not work, change "extract type" to "html source" to scrape the source code of the DOM element.

 

How to scrape data with regular expression?

As with a normal scraping action, add "scrape page" and "capture content" actions, then change the extract type to regular expression and enter the regular expression string in the control. For example:

\b[A-Z0-9._%-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b  

will scrape email addresses.

Because FMiner runs the regular expression against the HTML code of the target, keep the target XPath at /html to search the whole page. FMiner runs the expression as a JavaScript regular expression match with the flags "igm" (ignore case, global, multiline), so make sure the expression is valid JavaScript regular expression syntax.
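
To check an expression before entering it in FMiner, you can try it in any JavaScript console. The following is a minimal sketch that assumes FMiner's matching behaves like the standard String match method with a RegExp built with the flags "igm". The helper extractAll and the sample HTML are made up for illustration.

```javascript
// Hypothetical helper: applies a pattern with the flags "igm"
// (ignore case, global, multiline), as described in the text above.
function extractAll(pattern, html) {
  const re = new RegExp(pattern, "igm");
  // With the "g" flag, match() returns all matches, or null if there are none.
  return html.match(re) || [];
}

// Made-up sample input, for illustration only.
const sampleHtml =
  "<p>Sales: Sales@Example.com</p>\n<p>Support: help.desk@example.org</p>";

// The email pattern from above; backslashes are doubled inside a string literal.
const emailPattern = "\\b[A-Z0-9._%-]+@[A-Z0-9.-]+\\.[A-Z]{2,4}\\b";

const emails = extractAll(emailPattern, sampleHtml);
console.log(emails); // → ["Sales@Example.com", "help.desk@example.org"]
```

Note that the {2,4} part limits the last part of the domain to four letters, so an address ending in a longer top-level domain such as .museum is not matched by this pattern.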

 

I can't select all the contents I want to scrape.

FMiner locates targets by XPath and position; see "select target".

Normally you can select all targets with "group select". For complex selections, you can enter the XPath manually. Here is a basic XPath tutorial: http://zvon.org/xxl/XPathTutorial/General/examples.html

If you really cannot select all the content you need in one selection, you can also create more "scrape page" actions for the different block groups.

 

How to see the html code of a page?

FMiner includes a web inspector. To open it, disable recording, right-click the page and choose "Inspect" from the menu. It shows the page code and the DOM tree structure, which is helpful when writing the XPath of targets manually.

 

 

How to scroll down a page?

Add a “run javascript in browser” action and enter this code:

window.scrollBy(0,20000)

If you need to wait for ajax to update the page, add a wait time node after it.

Since version 8.00, FMiner has a "scroll down" action that you can add directly.
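
The fixed offset of 20000 pixels may not reach the bottom of a very long page. As a minimal sketch of an alternative, the code below scrolls to the current document height using standard DOM properties. The helper name scrollToBottom is made up for illustration, and its behavior in FMiner's browser has not been verified here. It is written with var in case the embedded browser is an older one.

```javascript
// Hypothetical helper: scrolls the given window to the current bottom
// of its document and returns the height it scrolled to.
function scrollToBottom(win) {
  var doc = win.document;
  // Use the larger of the two heights, since browsers differ in which one
  // reflects the full page.
  var height = Math.max(doc.body.scrollHeight, doc.documentElement.scrollHeight);
  win.scrollTo(0, height);
  return height;
}

// Only run when a real browser window is available.
if (typeof window !== "undefined") {
  scrollToBottom(window);
}
```

If the page loads more content after each scroll, the action can be repeated with a wait time node between runs.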

 

Firing a keyboard event

Some pages have no search button and need the Enter key to be pressed to get results. The "fill" action may not work on them. In that case, use a "runjs" action with code like this to fire the event:

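// Create a keyboard event for the Enter key (key code 13).
var evt = document.createEvent("KeyboardEvent");
evt.initKeyboardEvent("keyup", true, true, window, false, false, false, false, 13, 0);
// Dispatch it on the input box; replace "suggestBoxEQ" with the id used on your page.
document.getElementById("suggestBoxEQ").dispatchEvent(evt);

The event type "keyup" can be replaced with "keydown" or "keypress".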

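Newer browsers also provide the standard KeyboardEvent constructor as a replacement for createEvent and initKeyboardEvent. The sketch below assumes that FMiner's embedded browser supports this constructor, which is not confirmed here. The helper name fireEnter is made up, and "suggestBoxEQ" is the element id from the example above.

```javascript
// Hypothetical helper: dispatches an Enter "keyup" event on an element.
// EventCtor is the event constructor to use (KeyboardEvent in a browser).
function fireEnter(element, EventCtor) {
  var evt = new EventCtor("keyup", {
    key: "Enter",
    keyCode: 13,      // legacy field that many pages still check
    bubbles: true,
    cancelable: true
  });
  return element.dispatchEvent(evt);
}

// Only run when a browser document and the constructor are available.
if (typeof document !== "undefined" && typeof KeyboardEvent === "function") {
  var box = document.getElementById("suggestBoxEQ");
  if (box) {
    fireEnter(box, KeyboardEvent);
  }
}
```
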
 

Some pages are missing from the results FMiner scraped

There are two possible reasons:

1. The pages have dynamic content and the waiting time is too short for it to update.

In this case, increase the action interval or the target wait time in the project settings dialog.

2. The targets on these pages cannot be selected correctly by the assigned XPath.

In this case, change the XPath, select the targets again for these pages, and test on more pages.

(To debug a project, check the logs widget and pay attention to the warning and error messages. They include links to the pages that may have problems; click a link to see that page's code and screenshot.)

 

 

The program sometimes crashes on big sites!

FMiner uses a browser core to scrape pages, and like any browser it may crash after accessing a lot of pages. This cannot be avoided completely. For a small project, you can open the project again and click "resume" to continue running it. For big projects that need to run for several days, you should use a tool to monitor FMiner and to restart and resume the project when it crashes.

You can use this tool to monitor FMiner: http://w-shadow.com/blog/2009/03/04/restart-on-crash/. It is freeware.

The steps:

1. Create a bat file containing the text "mainwin.exe --resume your_project_path.fmpx" and copy it to FMiner's folder. Here we name it "runprj.bat".

2. Configure "Restart on Crash" like this:

 

3. Then run the project. It will restart and resume whenever it crashes.

 

Move license to another computer

To move a license to another computer, you must first disable it on the old computer. To do this, open the About dialog and click the button for transferring the license to another computer. You can then use the license on the other computer.

Note: A license cannot be transferred more than once a week, or we will ban the license.