Web Scraper To Excel



If you’re a fantasy nerd like me, having access to sports stats can be extremely useful in giving you a leg up on your competition!

By sorting players by certain stats, you can find hidden gems for bargain prices, especially in later rounds of the draft. Think Moneyball for fantasy sports.

Web scraping tools, also known as web harvesting or web data extraction tools, use automation to extract useful information from websites. They let you collect large amounts of data at scale and download it as Excel, CSV, or XML.

Scraper, for example, is an easy-to-use browser tool that can extract data from an online table and upload the result to Google Docs. Just select some text in a table or a list, right-click on the selected text, and choose 'Scrape Similar' from the browser menu.

In this tutorial, we will demonstrate how to use ParseHub to extract stats on every quarterback in the NFL from the NFL website.

In this tutorial, we will show you how to:

  1. Scrape data from a sports stats website.
  2. Import the data into a Google Doc, for reference and to share with friends.

Step 1: Use a web scraper to scrape data from a sports database

1. Extract the players’ names

  1. Download ParseHub for Free and start up the desktop app
  2. Go to the landing page for NFL quarterback stats.
  3. Using the Select tool, select the first player name on the list by clicking on it.
  4. Click on another player name and they will all be selected and extracted for you.
  5. Rename this selection to player.

Now for the fun part. For every player name that you’ve extracted, you want to create a relative selection to their stats, and extract them as well. Say, for example, you want to know the team that the QB is on, their number of completed and attempted passes, total yards, touchdowns, and interceptions thrown.

You can collect all of that data by following the steps below.

2. Collect the team names

  1. Click on the 'plus' button next to the 'Begin new entry in player' command.
  2. Add the relative select command to create a link between the player name, and their team name. Click on the player name and click on the team name. All of the team names should be selected for you - with an arrow pointing from the player names to the team names on the entire page.
  3. Name this selection team. The name and URL will be extracted for you.

Now you can just repeat steps 1 to 3 in the 'Collect the team names' section for each stat that you want. I will show you how to do one more, and grab the interception stats for each QB.

  1. Click on the 'plus' button next to the 'Begin new entry in player' command and add the relative select command again.
  2. Click on the player name and click on the number of interceptions that they have. Now all of them will be selected.
  3. Name this selection int.
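Conceptually, a relative selection ties each player name to the other values in the same row. The sketch below illustrates that idea in Python on a tiny, made-up HTML table. The markup, class names, and placeholder values are all assumptions for illustration; this is not how ParseHub works internally, and no coding is needed to follow the tutorial.

```python
from html.parser import HTMLParser

# Hypothetical markup standing in for one page of a stats table.
# Class names and values are placeholders, not real NFL data.
SAMPLE_HTML = """
<table>
  <tr><td class="player">Player A</td><td class="team">Team X</td><td class="int">4</td></tr>
  <tr><td class="player">Player B</td><td class="team">Team Y</td><td class="int">7</td></tr>
</table>
"""


class RowParser(HTMLParser):
    """Collects one dict per table row, keyed by each cell's class name."""

    def __init__(self):
        super().__init__()
        self.rows = []       # finished rows
        self._row = None     # row currently being read
        self._field = None   # class of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = {}
        elif tag == "td" and self._row is not None:
            self._field = dict(attrs).get("class")

    def handle_data(self, data):
        # Only keep text that sits inside a named cell.
        if self._field and data.strip():
            self._row[self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "td":
            self._field = None
        elif tag == "tr" and self._row:
            self.rows.append(self._row)
            self._row = None


def extract_players(html):
    """Return a list of dicts, one per player row."""
    parser = RowParser()
    parser.feed(html)
    return parser.rows


players = extract_players(SAMPLE_HTML)
for p in players:
    print(p["player"], p["team"], p["int"])
```

Each dict plays the role of one "entry in player": the name is the anchor, and the team and interception values are found relative to it.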

Here’s a sample of the data that we collected by repeating the above steps for team, interceptions, yards, attempts, completions, and touchdowns.

This selects only the first page of players, however. Once you’ve created the relative selections for the stats you want, you’ll need to take one final step to get ParseHub to go through all the pages of QBs using pagination.

Step 2: Scrape stats from all of the pages

  1. Click on the 'plus' button next to the 'Select page' command.
  2. Add a Select command and click on the 'Next' button on the top right corner of the table.
  3. Name this selection next.
  4. Create a Conditional command to ensure that the selected link is actually the 'next' button. Enter $e.text == "next".
    Note: This is not usually required; the NFL's website just works differently than most. Usually, you would just select the “next” button, and use the Click tool to go to the next page.
  5. Open the command menu by clicking on the 'plus' button next to the Conditional command, and add a Click command.
  6. ParseHub navigates by “clicking” by default, which is what we want here. Choose Go to Existing Template and pick main_template, so that ParseHub repeats on every other page exactly what we instructed it to do on this one.
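The pagination steps above amount to a simple loop: scrape the page, look for a link whose text is exactly "next", and if one exists, run the same template on the page it points to. Here is a rough Python sketch of that logic. The PAGES dictionary is a hypothetical stand-in for the website, and the function name is made up for this example.

```python
# Hypothetical pages: each has some rows plus the links found near the table.
PAGES = {
    "page1": {"rows": ["Player A", "Player B"], "links": [("next", "page2")]},
    "page2": {"rows": ["Player C"], "links": [("prev", "page1"), ("next", "page3")]},
    "page3": {"rows": ["Player D"], "links": [("prev", "page2")]},
}


def scrape_all(start):
    """Apply the same template to every page reachable through 'next' links."""
    collected = []
    current = start
    while current is not None:
        page = PAGES[current]
        collected.extend(page["rows"])  # same extraction on every page
        # Conditional step: follow a link only if its text is exactly "next".
        current = next(
            (target for text, target in page["links"] if text == "next"),
            None,
        )
    return collected


all_rows = scrape_all("page1")
print(all_rows)
```

The loop stops on the last page because no link there passes the "next" check, which mirrors why the Conditional command is useful when other links share the same position as the button.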

Run your project and download your data

  1. Click on the 'Get Data' button
  2. Click on the 'Run' button
  3. Click on the 'Save and Run' button
  4. Wait for ParseHub to extract all of the data from the page
  5. When you see 'CSV' or 'JSON' appear under the 'Actions' heading - click to download your data.
  6. You will also get an email when your project is finished scraping data. There will be a link to your data.

And there! You’re all done. Run your project, extract your data, and you’ll have all the stats you want at your fingertips.
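Once you have the CSV, sorting players by a stat takes only a few lines. The sketch below uses Python's standard csv module on inline sample text. The column names (player, team, td, int) and all values are placeholders chosen for illustration; the headers in your own export depend on how you named your selections.

```python
import csv
import io

# Placeholder CSV text; in practice you would open the file you downloaded.
SAMPLE_CSV = """player,team,td,int
Player A,Team X,20,9
Player B,Team Y,31,5
Player C,Team Z,12,14
"""


def rank_by(csv_text, column, descending=True):
    """Return the rows sorted by a numeric column."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return sorted(rows, key=lambda r: int(r[column]), reverse=descending)


top = rank_by(SAMPLE_CSV, "td")
for row in top:
    print(row["player"], row["td"])
```

Passing descending=False ranks from lowest to highest, which suits stats like interceptions where a smaller number is better.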

Step 3: Set up the ParseHub API in Google Sheets

Downloading a CSV or Excel file from ParseHub gives you a one-time snapshot. To keep your data in a Google Doc up to date instead, use the IMPORTDATA function to import it into Google Sheets.

Every time the project runs and scrapes the website, this Google Sheet will be updated with new data. You can also schedule ParseHub to run and grab your data at regular intervals. When ParseHub scrapes data on a schedule, the data in your Google Doc will also be refreshed.

1. Find your API key and project token:

  1. Open the project we just worked on
  2. Find your project token in the 'Settings' tab of the project
  3. Save the project to see your Project Token.
  4. Find your API key by clicking on the 'profile' icon in the top-right corner of the app.
    Click 'Account' and you will see your API key listed.

2. Open Google Sheets and create your IMPORTDATA function

  1. Open a new Google Sheet
  2. Click on the A1 cell and type in =IMPORTDATA()
  3. In the =IMPORTDATA() function create your URL like the following:

=IMPORTDATA("https://www.parsehub.com/api/v2/projects/PROJECT_TOKEN/last_ready_run/data?api_key=API_KEY&format=csv")

  1. Replace the PROJECT_TOKEN with the actual project token from the 'Settings' tab of your project.
  2. Replace the API_KEY with the API key from your account.

We created this URL based on the ParseHub API reference. Take a look at it for more information.

If you did everything correctly, you should see the data from your project appear almost immediately.
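The same URL can also be fetched outside of Google Sheets. The following is a minimal Python sketch, assuming the endpoint path and the api_key and format parameters described in the ParseHub API reference; the helper names build_data_url and fetch_rows are made up for this example. The fetch function is defined but not called here, since it needs network access and real credentials.

```python
import csv
import gzip
import io
import urllib.parse
import urllib.request

API_BASE = "https://www.parsehub.com/api/v2/projects"


def build_data_url(project_token, api_key, fmt="csv"):
    """Build the 'last ready run' data URL used in the IMPORTDATA formula."""
    query = urllib.parse.urlencode({"api_key": api_key, "format": fmt})
    token = urllib.parse.quote(project_token, safe="")
    return f"{API_BASE}/{token}/last_ready_run/data?{query}"


def fetch_rows(project_token, api_key):
    """Download the last ready run as CSV and return a list of dicts."""
    url = build_data_url(project_token, api_key)
    with urllib.request.urlopen(url) as resp:
        body = resp.read()
        # Assumption: the response may be gzip-compressed; urllib does not
        # decompress automatically, so handle that case explicitly.
        if resp.headers.get("Content-Encoding") == "gzip":
            body = gzip.decompress(body)
    return list(csv.DictReader(io.StringIO(body.decode("utf-8"))))


# Placeholders, exactly as in the formula above; swap in your own values.
print(build_data_url("PROJECT_TOKEN", "API_KEY"))
```

Keep your API key out of shared scripts and spreadsheets where possible, since anyone with the key and project token can read your data.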

ParseHub can handle much more complex projects and larger data sets, and you can set up any kind of project you want using this or similar methods. Good luck and happy Fantasy-ing!

[This post was originally written on March 16, 2015 and updated on August 6, 2019]

  1. Add to desktop
  2. Click on target page
  3. Download .xlsx file

Developers go through the pain of trial and error until they arrive at a reliable data schema. With Listly, they can skip that pain. They don't have to sit for hours or days inspecting web pages. Listly aims to give good results even on complex and unpredictable page structures. No coding, no stress.

Retailers, marketers, salespeople, analysts, researchers, and so on: non-developers frequently need more data in their fields. With Listly, everyone can get data just in time. They can stop wasting time on repetitive copy-and-paste, and they don't need to ask programmers for help and then wait. In the end, they can focus on the real work.

Scheduler.

Proxy Server.

Parallel Extraction.

Wait for Loading.

Error Screenshot

Auto Scroll.

Auto Save.

Auto Click.

HTML Fileboard.

API Integration.
CSV, JSON