GEDCOM Crawler
Download your entire family tree
version 1.1
© 2006 Antone Roundy
All Rights Reserved
More Information sources
I have not used these sites, so I can't vouch for them personally, but they are popular, so I imagine you might find them useful.
- Government-Records.com: Marriage, death, civil, census records, etc.
- Records Registry: Ancestry archives, cemetery records, census, family histories, marriage records, vital records, etc.
- GovernmentRegistry.org: Birth, marriage, vital records, etc.
- People-Files.com: Birth, death, marriage records, etc.
- BirthRecords.ws: Birth records, adoption records, ancestors, etc.
"197 GEDCOMs downloaded, 4832 to go...Ugh!"
Have you ever tried to download all the GEDCOMs for your ancestors to import into PAF?
How many did you finish before you gave up?
I'd finished all of maybe 2 or 3 and immediately realized that doing it manually would take forever.
So I put off getting started for years...until I realized that I could write a script to download them all automatically.
GEDCOM Crawler generates a single GEDCOM file containing all of the information from the LDS Ancestral File for the ancestors of the indicated family. In about 45 minutes, it downloaded data on 5,525 families in my family tree, stretching all the way back to 124 A.D.!
NOTE: Because the data comes from the Ancestral File, it does not contain LDS ordinance data. Look for other tools to help you import this data into PAF.
Requirements
Because of the amount of bandwidth GEDCOM Crawler uses when downloading large family trees,
I do not run it on my server on behalf of anyone but family members.
GEDCOM Crawler can run on a computer that meets the following requirements:
- Internet connection (preferably high-speed)
- Perl
- Command line access (i.e., if you are connecting to the computer over a network rather than sitting at the terminal, you will need something like Telnet or SSH access)
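As a quick way to check the Perl requirement before going further, here is a minimal sketch that looks for an executable on the PATH. The helper name `has_command` is made up for this illustration and is not part of GEDCOM Crawler; it assumes Python is available on the machine you are checking.

```python
import shutil


def has_command(name):
    """Return True if an executable called `name` is found on the PATH."""
    return shutil.which(name) is not None


# Report whether a perl interpreter appears to be installed.
print("perl found:", has_command("perl"))
```

If this prints `perl found: False`, Perl is either not installed or not on the PATH of the account you are logged in as.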
License
By downloading GEDCOM Crawler,
you agree to the terms of this license.
- Subject to the terms of this license, you may download and use GEDCOM Crawler at no charge.
- You may make modifications to your own copy.
- You may NOT redistribute GEDCOM Crawler in whole or in part, nor modified copies or other scripts based on GEDCOM Crawler, whether commercially or for free, without prior written consent. Refer people to this website instead.
- This script is provided as-is with no warranty whatsoever. Use it at your own risk. If you live in a jurisdiction that does not allow complete disclaimer of warranty, then you may NOT download nor use GEDCOM Crawler.
Known Limitations
- Only works with the LDS Ancestral File. I'm looking into making it work with RootsWeb too.
- An individual may have more information about them listed in the GEDCOM file for one family than another. GEDCOM Crawler neither merges the information nor attempts to determine which record is the most complete, but simply uses the first one it finds. I'll try to improve on that in a later version.
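To make the second limitation concrete, here is a small sketch of a "first record wins" policy. It is not GEDCOM Crawler's actual code; the function name, the individual IDs, and the record layout are all assumptions made for the illustration.

```python
def first_record_wins(records):
    """Keep the first record seen for each individual ID and ignore later ones.

    `records` is an iterable of (individual_id, record_lines) pairs in the
    order they were encountered. Both names are hypothetical.
    """
    kept = {}
    for individual_id, record_lines in records:
        if individual_id not in kept:  # later, possibly richer, copies are dropped
            kept[individual_id] = record_lines
    return kept


# The same person appears in two family files; the second copy has more detail.
records = [
    ("I1", ["1 NAME John /Doe/"]),
    ("I2", ["1 NAME Jane /Roe/"]),
    ("I1", ["1 NAME John /Doe/", "1 BIRT", "2 DATE 1 JAN 1800"]),
]
merged = first_record_wins(records)
print(len(merged["I1"]))  # prints 1: the richer second copy was not used
```

In this toy data the birth date from the second copy of I1 is lost, which is the kind of gap the limitation describes.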
Support
Sorry, I'm way too busy--no technical support is available at this time.
Instructions
- Download and unzip GEDCOM-Crawler.zip (By downloading GEDCOM Crawler, you agree to the above license terms). NOTE: The files use UNIX style linebreaks. When opening the files on other operating systems, things may look weird. I may make additional packages with different linebreak styles later.
- Open gedcom-crawler-conf.pl in a text editor like Notepad, or better yet, a programmer's text editor.
- If the computer on which GEDCOM Crawler will run has a host name (for example, if you are going to run it on your webserver), enter the host name in $myHostName. Otherwise you will need to find out the IP address of the computer and enter it in $myHostIP. (If you enter the host name, you may leave $myHostIP blank.)
- Save your changes (be sure to save as a plain text file).
- If you are going to run GEDCOM Crawler on a different computer than the one you have it on, upload all four files to that computer.
- If necessary, make all four files executable.
For example, on a UNIX or Linux computer, you might type the following command on the command line from within the directory where the files are located:
chmod 700 gedcom-crawler*
- Find the family ID for the most recent ancestor on the line whose data you wish to download:
- Go to the Ancestral File Search Form
- Find the ancestor
- Go to their individual record
- Copy the "Family" link on the right side of the page across from their name (don't go to that page and copy the URL in your browser's address bar--that won't be the right one--instead, right click the "Family" link and select "copy URL" or something like that from that menu)
- The number after "familyid=" in that URL is the family ID.
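If you would rather not pick the number out by eye, the last step can be sketched in a few lines. The example URL below is invented; only the "familyid=" parameter name comes from the steps above, and the function name is hypothetical.

```python
from urllib.parse import urlparse, parse_qs


def family_id_from_url(url):
    """Return the value of the familyid query parameter, or None if absent."""
    query = parse_qs(urlparse(url).query)
    values = query.get("familyid")
    return values[0] if values else None


# Hypothetical URL: only the "familyid=" parameter name comes from the text above.
example = "http://example.org/search/family?familyid=123456&other=1"
print(family_id_from_url(example))  # prints 123456
```

Paste the copied "Family" link in place of `example` to get the ID to use on the command line.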
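- Run GEDCOM Crawler, giving the family ID on the command line. For example, if the family ID is 123456:
./gedcom-crawler.pl 123456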
Here, the fun begins!
GEDCOM Crawler will start downloading and combining GEDCOM files for your ancestors.
Each time it downloads a file, it will tell you how many families' data it has processed,
and how many family IDs it has found but not processed.
Watch both numbers go up for a while as it finds more and more families!
After the numbers, it will display one dot for each line in the GEDCOM file--some families are bigger than others.
IMPORTANT: GEDCOM Crawler will create a file whose name is the family ID specified on the command line plus ".ged". If you already have a file by that name in the same directory as GEDCOM Crawler, it will be overwritten.
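The two counters described above (families processed, and family IDs found but not yet processed) are what you would expect from a worklist-style traversal. The sketch below shows that general idea on a toy in-memory tree. It is an assumption about the approach, not GEDCOM Crawler's actual implementation, and `parents_of` is a hypothetical stand-in for the download step.

```python
from collections import deque


def crawl(start_family, parents_of):
    """Visit a family and all of its ancestor families, printing progress.

    `parents_of` maps a family ID to the family IDs of that family's parents.
    """
    pending = deque([start_family])   # found but not yet processed
    seen = {start_family}             # guards against processing a family twice
    processed = []
    while pending:
        family = pending.popleft()
        processed.append(family)
        for parent in parents_of.get(family, []):
            if parent not in seen:
                seen.add(parent)
                pending.append(parent)
        print(f"{len(processed)} processed, {len(pending)} to go")
    return processed


# Toy tree: family 1 descends from families 2 and 3, which share ancestor family 4.
toy_tree = {"1": ["2", "3"], "2": ["4"], "3": ["4"]}
order = crawl("1", toy_tree)
```

On the toy tree the "to go" count first holds at 2 while new families are still being found, then falls to 0, which mirrors the behaviour described above on a small scale.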
./gedcom-crawler.pl 567890 x
This will prevent GEDCOM Crawler from clearing the family IDs it has already downloaded from the database.
./gedcom-crawler-skip-families.pl gedcom-file.ged
./gedcom-crawler-skip-families.pl gedcom-file.ged x