A Blockchain and A.I First Innovation Hub.

3rd Floor, Plaza 2000, Mombasa Rd.
+254 716 490 808
[email protected]


This project involves extracting information from the Industrial Property Journals published by the Kenya Industrial Property Institute (KIPI). The journals form part of the procedure for protecting industrial property rights under the Industrial Property Act, 2001 and the Trade Marks Act, Cap 506 of the Laws of Kenya. They are published monthly and, like most governmental or parastatal publications, are issued in PDF format. Each journal contains details of all the patents, trademarks and industrial designs applied for in the previous month, including the details specified by the World Intellectual Property Organization (WIPO) through its internationally accepted WIPO standards.

The project entails extracting data from the KIPI publications and storing it in a tabular structure. Currently the data exists as free-flowing text. For each patent, utility model and industrial design, each of the data points shown in Table 1 below should be captured. INID_CODE stands for Internationally agreed Numbers for the Identification of (bibliographic) Data. Each code identifies a field we wish to capture for each record.


Table 1: INID codes and fields to capture

INID_CODE  FIELD_NAME              DESCRIPTION
(11)       DOCUMENT_NUMBER         Number of document
(12)       DOCUMENT_TYPE           Designation of the kind of document
(15)       CORRECTED_INFO_PUI      Correction information
(19)       PUB_ORG                 Code of the office or organization publishing the document
(21)       APPLICATION_NUMBER_PUI  Application number
(22)       APPLICATION_DATE_PUI    Date of filing of application
(30)       PRIORITY_DATA           Priority data
(31)       PRIORITY_NUMBER         Priority number(s)
(32)       PRIORITY_DATE           Priority date(s)
(33)       PRIORITY_COUNTRY        Priority country(ies)
(41)       PUB_DATE_UNEXAMINED     Publication date of unexamined application
(42)       PUB_DATE_EXAMINED       Publication date of examined application
(45)       REGISTRATION_DATE_PUI   Publication date of registered right
(51)       INTER_CLASS_PUI         International classification
(54)       TITLE_INVENTION         Title of invention
(56)       CITED_DOCS              Cited documents
(71)       APPLICANT_NAME_PUI      Name(s) of applicant(s)
(72)       INVENTOR_NAME           Name of inventor(s)
(740)      NAME_ADDRESS_AGENT_PUI  Name and address of agent or address for service
(85)       DATE_PCT                Date of entry into national phase for PCT application
(87)       PCT_DATA                PCT publication data
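
As an illustration of how the codes in Table 1 might drive the extraction, the sketch below maps each INID code to its column name and cuts one record's text at every parenthesised code. It assumes the text has already been pulled out of the PDF, and that the codes appear in the journal in parentheses, as in the table. The function name `parse_patent_record` and the sample values are hypothetical and not taken from a real journal.

```python
import re

# INID code -> column name, copied from Table 1.
PATENT_FIELDS = {
    "11": "DOCUMENT_NUMBER",
    "12": "DOCUMENT_TYPE",
    "15": "CORRECTED_INFO_PUI",
    "19": "PUB_ORG",
    "21": "APPLICATION_NUMBER_PUI",
    "22": "APPLICATION_DATE_PUI",
    "30": "PRIORITY_DATA",
    "31": "PRIORITY_NUMBER",
    "32": "PRIORITY_DATE",
    "33": "PRIORITY_COUNTRY",
    "41": "PUB_DATE_UNEXAMINED",
    "42": "PUB_DATE_EXAMINED",
    "45": "REGISTRATION_DATE_PUI",
    "51": "INTER_CLASS_PUI",
    "54": "TITLE_INVENTION",
    "56": "CITED_DOCS",
    "71": "APPLICANT_NAME_PUI",
    "72": "INVENTOR_NAME",
    "740": "NAME_ADDRESS_AGENT_PUI",
    "85": "DATE_PCT",
    "87": "PCT_DATA",
}

# Matches a parenthesised two- or three-digit code such as "(21)" or "(740)".
INID_MARKER = re.compile(r"\((\d{2,3})\)")


def parse_patent_record(text):
    """Return {column: value} for one record; fields not found stay empty."""
    row = {name: "" for name in PATENT_FIELDS.values()}
    # Keep only markers whose code is listed in Table 1.
    markers = [m for m in INID_MARKER.finditer(text) if m.group(1) in PATENT_FIELDS]
    for i, marker in enumerate(markers):
        # A field's value runs up to the next known marker (or end of text).
        end = markers[i + 1].start() if i + 1 < len(markers) else len(text)
        # Collapse line breaks and repeated spaces left over from the PDF.
        value = " ".join(text[marker.end():end].split())
        row[PATENT_FIELDS[marker.group(1)]] = value
    return row


# Made-up sample text, for illustration only.
sample = (
    "(11) KE 999 (21) KE/P/2020/0001 (22) 01/02/2020\n"
    "(54) Example title\n spanning two lines (72) Jane Doe"
)
record = parse_patent_record(sample)
print(record["TITLE_INVENTION"])
```

One limitation of this approach: a field value that itself contains a parenthesised number matching a code in Table 1 would be split in the wrong place, so real journals may need extra checks.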


For trademarks, Table 2 below lists the data points we wish to capture.


Table 2: INID codes and fields to capture for trademarks

INID_CODE  FIELD_NAME                    DESCRIPTION
180        EXPIRY_DATE                   Expiry date
210        APPLICATION_NUMBER_MARK       Application number
220        APPLICATION_DATE_MARK         Filing date of application
300        PARIS_CONVENTION              Data relating to application under the Paris Convention
442        ADVERT_DATE                   Date of advertisement
511        INTER_CLASS_MARK              International classification
540        MARK_REPRODUCTION             Reproduction of the mark
566        MARK_TRANSLATION              Translation of mark or words contained in the mark
15         CORRECTED_INFO_MARK           Amendment/correction information
591        INFO_COLOURS                  Information concerning colours claimed
730        NAME_ADDRESS_PROPRIETOR_MARK  Name and address of the proprietor
740        NAME_ADDRESS_APPLICANT_MARK   Name and address of the agent or applicant
791        NAME_ADDRESS_LICENSEE_MARK    Name and address of licensee/registered user

  1. All participants shall use different PDF documents, which shall be provided as part of the project
  2. The completed code and the file used shall be uploaded to the participant's GitHub account
  3. All participants must stick to the stipulated timelines for the project

Submission Criteria

Create a GitHub repo with the NLP code and include the dataset (PDF) used as input. The code must output a CSV file with the fields listed above. Indicate the file used for the analysis. Submissions will be measured on the completion rate of the fields.
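
A minimal sketch of the CSV output step, using Python's standard `csv` module, is shown below. The column list is a small subset of the patent fields chosen for brevity, and the row values and the helper name `write_rows` are hypothetical. Fields that a record lacks are written as empty cells so that every row has the same columns.

```python
import csv
import io

# Subset of the patent columns, for brevity; a real run would list all fields.
COLUMNS = ["APPLICATION_NUMBER_PUI", "APPLICATION_DATE_PUI", "TITLE_INVENTION"]


def write_rows(rows, columns, handle):
    """Write dict rows to an open text file, one column per field."""
    writer = csv.DictWriter(
        handle,
        fieldnames=columns,
        restval="",             # missing fields become empty cells
        extrasaction="ignore",  # keys outside `columns` are skipped
    )
    writer.writeheader()
    writer.writerows(rows)


# Made-up rows, for illustration only.
rows = [
    {"APPLICATION_NUMBER_PUI": "KE/P/2020/0001", "TITLE_INVENTION": "Example, with comma"},
    {"APPLICATION_NUMBER_PUI": "KE/P/2020/0002", "APPLICATION_DATE_PUI": "01/02/2020"},
]
buffer = io.StringIO()
write_rows(rows, COLUMNS, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write to disk instead of memory, the same function can be given a file opened with `open(path, "w", newline="", encoding="utf-8")`, where `path` is whatever output name the participant chooses.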

  1. Successful extraction of the information from the KIPI journals and organization of the same in tabular form as a CSV file
  2. Completed code and files used shall be in the participant's GitHub account, to be evaluated upon completion or when the set timeline lapses
  3. The metric for success will be the number of fields successfully populated by the participant's code
  4. Success shall also be measured on the ability of the code to populate a CSV file with the fields as columns
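
Participants may want to track field completion themselves while developing. The exact scoring formula is not stated here, so the sketch below rests on one plausible reading: the share of cells that are non-empty across all records, plus a per-field count. The function names and sample rows are hypothetical.

```python
def is_filled(value):
    """A cell counts as populated if it holds anything other than whitespace."""
    return bool(str(value).strip())


def completion_rate(rows, columns):
    """Fraction of (record, field) cells that are populated, from 0.0 to 1.0."""
    total = len(rows) * len(columns)
    if total == 0:
        return 0.0
    filled = sum(1 for row in rows for col in columns if is_filled(row.get(col, "")))
    return filled / total


def per_field_counts(rows, columns):
    """Number of records in which each field is populated."""
    return {col: sum(1 for row in rows if is_filled(row.get(col, ""))) for col in columns}


# Made-up rows, for illustration only.
columns = ["APPLICATION_NUMBER_MARK", "APPLICATION_DATE_MARK", "EXPIRY_DATE"]
rows = [
    {"APPLICATION_NUMBER_MARK": "100001", "APPLICATION_DATE_MARK": "05/01/2021", "EXPIRY_DATE": ""},
    {"APPLICATION_NUMBER_MARK": "100002", "APPLICATION_DATE_MARK": "   "},
]
rate = completion_rate(rows, columns)
counts = per_field_counts(rows, columns)
print(rate, counts)
```

The per-field counts show which INID codes the extraction misses most often, which is usually more useful for debugging than the overall rate.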

1st – Ksh. 50,000

2nd – Ksh. 30,000

3rd – Ksh. 20,000

2 Weeks

You have been provided with a KIPI PDF journal. Develop an algorithm that cleans up the data and populates it into an Excel sheet. See details in the Hackathon Description.
