Scraping LinkedIn Data with Proxycurl, Python, and Node.js

This content originally appeared on DEV Community 👩‍💻👨‍💻 and was authored by Anuoluwapo Balogun

Today, I want to show you how to scrape data from LinkedIn using the Proxycurl API with Python and Node.js.

Let's start by scraping data with Python and the requests library.

I am going to use the Employee Count Endpoint of the Proxycurl Company API.

Install the requests package (the leading ! is for notebooks such as Jupyter or Colab; drop it in a regular terminal):

!pip install requests

Next, get a Proxycurl API key: create an account with Proxycurl and generate your key.
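If you would rather not paste the key into your script, here is a minimal sketch that reads it from an environment variable instead. The variable name PROXYCURL_API_KEY and the helper build_auth_header are illustrative names I made up for this example, not something Proxycurl requires.

```python
import os


def build_auth_header(env_var: str = "PROXYCURL_API_KEY") -> dict:
    """Return the Authorization header, reading the key from the environment.

    The environment variable name is an assumption; use whatever name you like.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        # Fail early with a clear message instead of sending a request without a key.
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {"Authorization": "Bearer " + api_key}
```

The returned dictionary can be passed as the headers argument of requests.get, in place of a header built from a hardcoded key.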

Let's count the number of employees working at Apple Inc.

Using the requests library:

import requests

api_endpoint ='https://nubela.co/proxycurl/api/linkedin/company/employees/count/'

api_key = 'YOUR_API_KEY_HERE'

header_dic = {'Authorization': 'Bearer ' + api_key}

params = {
    'linkedin_employee_count': 'include',
    'employment_status': 'current',
    'url': 'https://www.linkedin.com/company/apple/',
}

response = requests.get(api_endpoint,
                        params=params,
                        headers=header_dic)

print(response.json())

The output is:

{'total_employee': 94262, 'linkedin_employee_count': 567686, 'linkdb_employee_count': 94262}
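The two counts in this response differ a lot. Assuming linkdb_employee_count is the number of employee profiles in Proxycurl's own dataset and linkedin_employee_count is the figure shown on the company's LinkedIn page (that is my reading of the field names, so check the Proxycurl documentation), a small sketch like this shows how much of the company the dataset covers. The function name linkdb_coverage is hypothetical.

```python
def linkdb_coverage(counts: dict) -> float:
    """Ratio of linkdb_employee_count to linkedin_employee_count."""
    linkedin_total = counts.get("linkedin_employee_count")
    if not linkedin_total:
        raise ValueError("linkedin_employee_count is missing or zero")
    return counts["linkdb_employee_count"] / linkedin_total


# The Apple response shown above.
apple_counts = {
    "total_employee": 94262,
    "linkedin_employee_count": 567686,
    "linkdb_employee_count": 94262,
}

print(f"{linkdb_coverage(apple_counts):.1%}")  # prints 16.6%
```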

Let's try to count the number of employees working at Twitter:

import requests

api_endpoint = 'https://nubela.co/proxycurl/api/linkedin/company/employees/count/'
api_key = 'YOUR_API_KEY_HERE'
header_dic = {'Authorization': 'Bearer ' + api_key}
params = {
    'linkedin_employee_count': 'include',
    'employment_status': 'current',
    'url': 'https://www.linkedin.com/company/twitter/',
}
response = requests.get(api_endpoint,
                        params=params,
                        headers=header_dic)
print(response.json())

The output is:

{'total_employee': 7472, 'linkedin_employee_count': 7992, 'linkdb_employee_count': 7472}

You can try this with as many companies as you like.
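To avoid copying the whole script for every company, the lookup can be wrapped in a loop. This is only a sketch: count_employees and its fetch parameter are hypothetical names, and the example runs against a stand-in dictionary built from the numbers shown above rather than the live API.

```python
from typing import Callable, Dict, Iterable


def count_employees(slugs: Iterable[str],
                    fetch: Callable[[str], dict]) -> Dict[str, int]:
    """Map each company slug to the total_employee value of its response.

    `fetch` takes a LinkedIn company URL and returns the decoded JSON response.
    """
    results = {}
    for slug in slugs:
        company_url = f"https://www.linkedin.com/company/{slug}/"
        results[slug] = fetch(company_url)["total_employee"]
    return results


# Stand-in for the real HTTP call, using the numbers shown above.
SAMPLE = {
    "https://www.linkedin.com/company/apple/": {"total_employee": 94262},
    "https://www.linkedin.com/company/twitter/": {"total_employee": 7472},
}

print(count_employees(["apple", "twitter"], SAMPLE.__getitem__))
# prints {'apple': 94262, 'twitter': 7472}
```

To hit the real API, pass a function that wraps the requests.get call from the earlier snippets and returns response.json().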

Next, let's try scraping data from LinkedIn using Proxycurl and Node.js.

Create a project folder, then change into it:

cd "C:\Users\user\Folder name"

Install the packages:

npm install express axios dotenv

or with Yarn

yarn add express axios dotenv

Generate an API key from Proxycurl and put it in a .env file in the project folder, where dotenv will pick it up:

API_KEY=YOUR_API_KEY_HERE

Code snippet

import express from 'express';
import axios from 'axios';
import dotenv from 'dotenv';

const app = express();

dotenv.config();

app.listen(8000, () => {
    console.log('App connected successfully!');
});

// Getting the company's job listings

// LinkedIn page of the company we want to look up
const TWITTER_URL = 'https://www.linkedin.com/company/twitter/';

const COMPANY_PROFILE_ENDPOINT = 'https://nubela.co/proxycurl/api/linkedin/company';

const JOBS_LISTING_ENDPOINT = 'https://nubela.co/proxycurl/api/v2/linkedin/company/job';

const JOB_PROFILE_ENDPOINT = 'https://nubela.co/proxycurl/api/linkedin/job';

// Request config for the company profile endpoint
const companyProfileConfig = {
    url: COMPANY_PROFILE_ENDPOINT,
    method: 'get',
    headers: {'Authorization': 'Bearer ' + process.env.API_KEY},
    params: {
        url: TWITTER_URL
    }
};

// Fetch the company profile
const getTwitterProfile = async () => {
    return await axios(companyProfileConfig);
};

const profile = await getTwitterProfile();

// The profile's search_id is what the job listing endpoint expects
const twitterID = profile.data.search_id;

console.log('Twitter ID:', twitterID);


const jobListingsConfig = {
    url: JOBS_LISTING_ENDPOINT,
    method: 'get',
    headers: {'Authorization': 'Bearer ' + process.env.API_KEY},
    params: {
        search_id: twitterID  // search_id taken from the company profile
    }
};

// Fetch the company's job listings
const getTwitterListings = async () => {
    return await axios(jobListingsConfig);
};

const jobListings = await getTwitterListings();

// The listings are read from the "job" field of the response
const jobs = jobListings.data.job;

console.log(jobs);

// Details of a specific job listing

const jobProfileConfig = {
    url: JOB_PROFILE_ENDPOINT,
    method: 'get',
    headers: { 'Authorization': 'Bearer ' + process.env.API_KEY },
    params: {
        url: jobs[0].job_url  // URL of the first listing in the jobs array
    }
};

// Fetch the full details of that job
const getJobDetails = async () => {
    return await axios(jobProfileConfig);
};

const jobDetails = await getJobDetails();

console.log(jobDetails.data);
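For comparison with the Python half of this post, the same three-step chain (company profile, then job listings via search_id, then the details of one job) could be sketched in Python. This is an illustration under assumptions: first_job_details and get_json are hypothetical names, the response fields are the ones the Node.js code reads, and the demo uses a stub instead of the live API. Unlike the snippet above, it returns None when the company has no listings instead of failing on jobs[0].

```python
from typing import Callable, Optional

BASE = "https://nubela.co/proxycurl/api"


def first_job_details(company_url: str,
                      get_json: Callable[[str, dict], dict]) -> Optional[dict]:
    """Company profile -> search_id -> job listings -> details of the first job.

    `get_json(endpoint, params)` performs the GET request and returns decoded JSON.
    """
    profile = get_json(f"{BASE}/linkedin/company", {"url": company_url})
    listings = get_json(f"{BASE}/v2/linkedin/company/job",
                        {"search_id": profile["search_id"]})
    jobs = listings.get("job") or []
    if not jobs:
        return None  # nothing to look up
    return get_json(f"{BASE}/linkedin/job", {"url": jobs[0]["job_url"]})


def stub_get_json(endpoint: str, params: dict) -> dict:
    """Stand-in for the real HTTP call, returning made-up data."""
    if endpoint.endswith("/linkedin/company"):
        return {"search_id": "123"}
    if endpoint.endswith("/company/job"):
        return {"job": [{"job_url": "https://example.com/jobs/1"}]}
    return {"title": "Example job"}


print(first_job_details("https://www.linkedin.com/company/twitter/", stub_get_json))
# prints {'title': 'Example job'}
```

With the requests library, get_json could be a small function that calls requests.get(endpoint, params=params, headers=header_dic) and returns response.json().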

Your package.json should look like this:

{
  "name": "nubela",
  "version": "1.0.0",
  "type": "module",
  "description": "",
  "main": "proxycurl.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "keywords": [],
  "author": "",
  "license": "ISC",
  "dependencies": {
    "axios": "^1.1.3",
    "dotenv": "^16.0.3",
    "express": "^4.18.2"
  }
}

You can try scraping any LinkedIn data of your choice using the Proxycurl API.


Anuoluwapo Balogun | Sciencx (2022-10-26T14:06:28+00:00) Scraping Linkedin Data with Proxycurl, Python Program, and Nodejs. Retrieved from https://www.scien.cx/2022/10/26/scraping-linkedin-data-with-proxycurl-python-program-and-nodejs/