Imputation of missing information in worldwide patent data


Date

2021-02

Publication Type

Journal Article

ETH Bibliography

yes

Abstract

We present a general method for imputing missing information in the Worldwide Patent Statistical Database (PATSTAT) and make the resulting datasets publicly available. The PATSTAT database is the de facto standard for academic research using patent data. Complete information on patents is essential to obtain an accurate picture of technological activities across countries and over time. However, the coverage of the database is far from complete. Our data imputation method exploits detailed institutional knowledge about the international patent system, and we codify it in a SQL algorithm. We provide two datasets related to the imputation of missing country codes and missing technology classification. We also release the algorithm that can be easily adapted to impute other pieces of information that are missing in PATSTAT.

Publication status

published

Volume

34

Pages / Article No.

106615

Publisher

Elsevier

Subject

Missing data; Patents; PATSTAT; Imputation; PostgreSQL

Organisational unit

06333 - KOF FB Innovationsökonomik / KOF Innovation Economics
02525 - KOF Konjunkturforschungsstelle / KOF Swiss Economic Institute
