Library2Notion (v1.1.1): create your own digital library in Notion with Python

Written by Helguera

Changelog


  • Added -i (--Ignore) flag to exclude folders from being processed.
  • Added -c (--Config) flag to provide a JSON config file with pre-configured parameters. The tool offers to create this file for you on the first execution.

  • Metadata is now extracted only from the books that need it, which considerably reduces execution time.

  • --only_new: Only check new books.
  • --only_updated: Only check changes in existing books (does not include removals).
  • --only_deleted: Only check deleted books.

Full rework of the code:
  • Added direct API calls to Notion.
  • No need for CSV files
  • No need for token_v2
  • Reduces complexity
  • Now it can delete entries from Notion
  • Idempotent executions

  • Added support for physical books
  • Fixed tool crashing when log file did not exist

  • First working version
  • Uploaded to PyPi

Introduction

My digital library has grown quickly lately and now contains over 600 books. It had become impossible for me to keep track of all of them.

That's why I decided to create a simple tool that uploads all books (just the metadata, not the book itself) to a database in Notion. That way I can assign statuses, priorities and tags, filter by a property or author, and add comments to the books.

The tool makes use of the Notion API and has the following main features:

  • Detects all digital books from a given path and all the subfolders.
  • Adds, updates or deletes entries in the Notion database to match the state of the local folder.
  • Lets you create a simple .paper file so that non-digital books are tracked as well.
  • Extracts metadata from .pdf and .epub files.

Installation



With Homebrew

brew install helguera/tap/library2notion

With pipx

pipx install library2notion

With PIP


pip install --user library2notion

Note: Python 3.7 or later is required.

From source

This project uses poetry for dependency management and packaging, so it has to be installed first. See the official documentation for instructions.

git clone https://github.com/helguera/library2notion.git
cd library2notion/
poetry install 
poetry run library2notion

Usage


$ library2notion --help
usage: library2notion [-h] -p PATH -t NOTIONTOKEN -d NOTIONDBID [-f FORMATS [FORMATS ...]] [--only_new | --only_updated | --only_deleted]

library2notion created by Javier Helguera (github.com/helguera) © 2023 MIT License

general options:
  -p, --Path PATH                    Path where to start looking for books. It will also check all subfolders
  -t, --NotionToken NOTIONTOKEN      Notion token
  -d, --NotionDbId  NOTIONDBURL      Notion database id
  -f, --Formats     FORMATS          List of formats to be taken into account. At this moment .PDF, .EPUB and .PAPER are supported
  -c, --Config      CONFIG           Provide a config json which includes the parameters
  -i, --Ignore      IGNORE           Folders to ignore (in a list)  
  --only_new                         Only check new books
  --only_updated                     Only check changes in existing books (does not include removals)
  --only_deleted                     Only check deleted books
  -h, --help                         Show this help message and exit

Input


-p, --Path

The path where the tool looks for files, including all subfolders. Using a relative path is highly recommended, because the path is also used to generate the tags of each book (take a look at the "Metadata" section for further info).

For example, if your library looks like:

/home/your_user/Documents/books/Programming/Python
/home/your_user/Documents/books/Programming/C
/home/your_user/Documents/books/History/Spanish History

The ideal way to proceed is to first move to the books folder, since it is the common one for all books.

cd /home/your_user/Documents/books

And from there, execute library2notion with a relative path:

library2notion -p "./"

⚠️ Important: A book is uniquely identified by its path. This is crucial because the book ./books/Programming/Python/PythonCookbook will be treated as a different one from ./Programming/Python/PythonCookbook.

-t, --NotionToken

This is the secret token that Notion provides when an integration is created. Visit the official docs for further info. Don't forget to give your integration page permissions.

You can also take a look at this post I wrote, which explains everything you need to know to create a Notion integration and connect it to your app.

-d, --NotionDbId

The id of the Notion database where all info will be uploaded. This database has to exist in advance and the columns it needs to have are fixed and can't be changed. These are:

Column       Type
File Name    Title
Title        Text
Priority     Select
Status       Select
Format       Multi-select
Tags         Multi-select
Comments     Text
Author       Text
Publisher    Text
ISBN         Text
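
To make the column types more concrete, here is a minimal sketch of how one book could be expressed as page properties for the Notion API. The property shapes (title, rich_text, multi_select) follow the public Notion API, but build_page_properties is a hypothetical helper written for this post and the payload the tool really sends may differ. Priority, Status and Comments are left out because they are meant to be filled in by hand in Notion.

```python
import json


def build_page_properties(file_name, title, formats, tags,
                          author="", publisher="", isbn=""):
    """Map one book to Notion page properties (hypothetical helper)."""

    def text(value):
        # "Text" columns are rich_text properties; empty values become an empty list.
        return {"rich_text": [{"text": {"content": value}}] if value else []}

    return {
        # "File Name" is the Title column of the database.
        "File Name": {"title": [{"text": {"content": file_name}}]},
        "Title": text(title),
        "Format": {"multi_select": [{"name": fmt} for fmt in formats]},
        "Tags": {"multi_select": [{"name": tag} for tag in tags]},
        "Author": text(author),
        "Publisher": text(publisher),
        "ISBN": text(isbn),
    }


# Made-up book used only for illustration.
properties = build_page_properties(
    file_name="./Programming/Python/PythonCookbook",
    title="Python Cookbook",
    formats=["PDF", "EPUB"],
    tags=["Programming", "Python"],
)
print(json.dumps(properties, indent=2))
```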

To get the database ID value, open the database as a full page in Notion. Use the Share menu to Copy link. Now paste the link in your text editor so you can take a closer look. The URL uses the following format:

https://www.notion.so/{workspace_name}/{database_id}?v={view_id}
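
If you prefer not to pick the id out of the link by hand, the following sketch does it for a URL in the format above. It assumes the id is the run of 32 hexadecimal characters at the end of the last path segment; extract_database_id is a hypothetical helper and the example link is made up.

```python
import re
from urllib.parse import urlparse


def extract_database_id(url):
    """Return the 32-character database id from a Notion database link."""
    # The id lives in the last path segment; the "?v=" view id is ignored.
    last_segment = urlparse(url).path.rstrip("/").split("/")[-1]
    # Assumption: the id is the trailing run of 32 hexadecimal characters.
    match = re.search(r"([0-9a-fA-F]{32})$", last_segment)
    if match is None:
        raise ValueError(f"No database id found in: {url}")
    return match.group(1)


# Made-up link, not a real database.
example = ("https://www.notion.so/my-workspace/"
           "0123456789abcdef0123456789abcdef"
           "?v=fedcba9876543210fedcba9876543210")
print(extract_database_id(example))
```

The printed value is what you would pass to -d.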

-f, --Formats

These are the formats that will be taken into account. At this moment, .epub, .pdf and .paper are supported and used by default.

library2notion -f EPUB PAPER  -> only look for .epub and .paper files
library2notion -f PDF         -> only look for .pdf files
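
As a rough idea of what the format filter means, here is a minimal sketch that walks a folder and keeps only files with the selected extensions. It is not the tool's actual code: find_books and DEFAULT_FORMATS are hypothetical names, and the demo runs on a throwaway folder with made-up files.

```python
import tempfile
from pathlib import Path

DEFAULT_FORMATS = ("EPUB", "PDF", "PAPER")


def find_books(root, formats=DEFAULT_FORMATS):
    """Return files under root (recursively) whose extension is in formats."""
    wanted = {"." + fmt.lower() for fmt in formats}
    found = []
    for path in Path(root).rglob("*"):
        # A file named exactly ".paper" has no suffix, so fall back to its name.
        extension = (path.suffix or path.name).lower()
        if path.is_file() and extension in wanted:
            found.append(path.relative_to(root))
    return sorted(found)


# Demo on a temporary folder with made-up file names.
with tempfile.TemporaryDirectory() as root:
    for name in ("Programming/Python/cookbook.pdf",
                 "Programming/Python/cookbook.epub",
                 "History/notes.txt"):
        file = Path(root, name)
        file.parent.mkdir(parents=True, exist_ok=True)
        file.touch()
    print(find_books(root, ["PDF"]))  # comparable to -f PDF
    print(find_books(root))           # all supported formats
```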

-c, --Config

A config.json file can be provided as input with the settings already in it. The tool will ask if you want it to create this file for you the first time it is executed, so future executions will be easier.

{
    "notion_secret_token": "",
    "notion_db_id": "",
    "path": "",
    "ignore": []
}
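
If you write the file by hand, a quick check like the one below can catch a missing key before you run the tool. This is only a sketch that assumes all four keys shown above are expected; validate_config and load_config are hypothetical helpers, and the sample values are placeholders.

```python
import json

REQUIRED_KEYS = ("notion_secret_token", "notion_db_id", "path", "ignore")


def validate_config(config):
    """Check that a parsed config contains every key shown above."""
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError(f"Missing config keys: {', '.join(missing)}")
    if not isinstance(config["ignore"], list):
        raise TypeError('"ignore" must be a list of folders')
    return config


def load_config(config_file):
    """Read a config JSON file from disk and validate it."""
    with open(config_file, encoding="utf-8") as handle:
        return validate_config(json.load(handle))


# Placeholder values for illustration only.
sample = json.loads("""
{
    "notion_secret_token": "secret_xxx",
    "notion_db_id": "0123456789abcdef0123456789abcdef",
    "path": "./",
    "ignore": ["History"]
}
""")
print(validate_config(sample)["ignore"])
```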

-i, --Ignore

This allows you to exclude folders or subfolders that you don't want to process. For example:

-i History "./Tech Books/Programming" 

The folder "History" will be completely ignored. Same for "Tech Books/Programming", but not for the rest of the books in "Tech Books".
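
The behaviour described above can be pictured with a small sketch. It assumes that ignored folders are interpreted relative to the path given with -p and matched as path prefixes; the tool may match them differently, and is_ignored is a hypothetical helper.

```python
from pathlib import PurePosixPath


def is_ignored(book_path, ignore):
    """Return True if book_path lives inside one of the ignored folders."""
    folder_of_book = PurePosixPath(book_path).parent.parts
    for folder in ignore:
        ignored_parts = PurePosixPath(folder).parts
        # Prefix match: every component of the ignored folder must line up.
        if ignored_parts and folder_of_book[:len(ignored_parts)] == ignored_parts:
            return True
    return False


ignore = ["History", "./Tech Books/Programming"]
for book in ("./History/Spanish History/book.pdf",
             "./Tech Books/Programming/Python/book.epub",
             "./Tech Books/Networking/book.pdf"):
    print(book, "->", "ignored" if is_ignored(book, ignore) else "kept")
```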

--only_new

Checks only new books that have been added to the path since the last execution. Skips updated and deleted ones.

--only_updated

Checks only books that have been updated since the last execution. Skips new and deleted ones.

--only_deleted

Checks only books that have been deleted since the last execution. Skips new and updated ones.
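
One way to think about the three flags is as set operations between what is on disk and what is already in Notion. The sketch below models each side as a dictionary from book path to its set of formats, which is an assumption made for illustration and not necessarily how the tool stores things; plan_sync is a hypothetical name.

```python
def plan_sync(local, remote):
    """Split books into new, updated and deleted.

    local and remote map a book path to the set of formats found
    on disk and in Notion respectively.
    """
    new = sorted(local.keys() - remote.keys())
    deleted = sorted(remote.keys() - local.keys())
    # A book counts as updated when it exists on both sides with different formats.
    updated = sorted(key for key in local.keys() & remote.keys()
                     if local[key] != remote[key])
    return {"new": new, "updated": updated, "deleted": deleted}


# Made-up library state.
local = {
    "./Programming/Python/PythonCookbook": {"PDF", "EPUB"},
    "./History/Rome": {"PDF"},
}
remote = {
    "./Programming/Python/PythonCookbook": {"PDF"},
    "./Programming/C/OldBook": {"EPUB"},
}
print(plan_sync(local, remote))
```

With this model, --only_new would act only on the first list, --only_updated on the second and --only_deleted on the third. Running the comparison again once both sides match yields three empty lists, which is what makes repeated executions idempotent.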

Metadata

The tool will extract the following data to upload to Notion:

  • File Name: is the full path to the file. It is used as primary key of the table in Notion, so it can't be duplicated.
  • Title: title of the book.
  • Tags: the categories of the book. They are generated automatically from the path. For example, if the path is ./Tech Books/Programming/Python/mybook.pdf, the tags will be Tech Books, Programming, Python.
  • Author: the author or authors of the book.
  • Publisher: the publisher of the book.
  • Format: the available formats of the book. A book available in multiple formats will only appear once in the database.
  • ISBN: the ISBN.
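
The tag generation described in the list above can be sketched in a couple of lines with pathlib. This is an illustration of the idea, assuming a relative path as recommended in the Input section; tags_from_path is a hypothetical helper, not the tool's own function.

```python
from pathlib import PurePosixPath


def tags_from_path(book_path):
    """Use every folder between the start path and the file as a tag."""
    # PurePosixPath drops the leading "./", so only real folder names remain.
    return list(PurePosixPath(book_path).parent.parts)


print(tags_from_path("./Tech Books/Programming/Python/mybook.pdf"))
# ['Tech Books', 'Programming', 'Python']
```

In this sketch an absolute path would turn every parent folder into a tag as well, which illustrates why relative paths are the safer choice.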

Non-digital books (.paper)

Since version 0.2.0, the tool supports non-digital books. For each non-digital book that you want to add, create a .paper file in a folder with the following content:

{
    "Title": "",
    "Author": "",
    "Publisher": "",
    "ISBN": ""
}
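
If you have many physical books, generating these files with a short script may be easier than writing them by hand. The sketch below writes the four fields shown above; the file naming ("<name>.paper") is an assumption, write_paper_file is a hypothetical helper, and the demo uses a temporary folder with made-up values.

```python
import json
import tempfile
from pathlib import Path


def write_paper_file(folder, name, title, author="", publisher="", isbn=""):
    """Create <folder>/<name>.paper with the fields shown above."""
    target = Path(folder) / f"{name}.paper"
    target.parent.mkdir(parents=True, exist_ok=True)
    content = {"Title": title, "Author": author,
               "Publisher": publisher, "ISBN": isbn}
    target.write_text(json.dumps(content, indent=4, ensure_ascii=False),
                      encoding="utf-8")
    return target


# Demo in a throwaway folder with placeholder values.
with tempfile.TemporaryDirectory() as folder:
    paper = write_paper_file(Path(folder, "History"), "ExampleBook",
                             "Example Title", author="Example Author")
    print(paper.read_text(encoding="utf-8"))
```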

Log Files

A log file will be created after each execution in the ./library2notion-logs folder. It will include info about created, updated and deleted books.

Examples

Let's suppose that our local library is located in /home/user/Documents/my-books.

cd /home/user/Documents/my-books

Normal execution

The normal execution will work in any scenario, but it takes more time to finish because more files have to be checked. That's why the flags --only_new, --only_updated and --only_deleted exist.

library2notion -p "./" -t NOTION_TOKEN -d NOTIONDB_ID

Some books have been added to the local folder

library2notion -p "./" -t NOTION_TOKEN -d NOTIONDB_ID --only_new

A new format of an existing book has been added to the local folder

library2notion -p "./" -t NOTION_TOKEN -d NOTIONDB_ID --only_updated

Some books have been deleted from the local folder

library2notion -p "./" -t NOTION_TOKEN -d NOTIONDB_ID --only_deleted

Take into account only PDF files

library2notion -p "./" -t NOTION_TOKEN -d NOTIONDB_ID -f PDF

Repository on GitHub

You can find the complete repository of the tool on my GitHub.

Help

If you need help with how to use the tool or have encountered a problem you can contact me at javier@javierhelguera. I hope you found this post useful.

Javier Helguera.
