initial commit

2025-06-03 16:22:21 +02:00
commit cb38373bd0
23 changed files with 3163 additions and 0 deletions

tap-personio/.gitignore

@@ -0,0 +1,136 @@
# Secrets and internal config files
**/.secrets/*
# Ignore meltano internal cache and sqlite systemdb
.meltano/
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
# C extensions
*.so
# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
# Translations
*.mo
*.pot
# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal
# Flask stuff:
instance/
.webassets-cache
# Scrapy stuff:
.scrapy
# Sphinx documentation
docs/_build/
# PyBuilder
target/
# Jupyter Notebook
.ipynb_checkpoints
# IPython
profile_default/
ipython_config.py
# pyenv
.python-version
# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock
# PEP 582; used by e.g. github.com/David-OConnor/pyflow
__pypackages__/
# Celery stuff
celerybeat-schedule
celerybeat.pid
# SageMath parsed files
*.sage.py
# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/
# Spyder project settings
.spyderproject
.spyproject
# Rope project settings
.ropeproject
# mkdocs documentation
/site
# mypy
.mypy_cache/
.dmypy.json
dmypy.json
# Pyre type checker
.pyre/


@@ -0,0 +1,47 @@
ci:
  autofix_prs: true
  autoupdate_schedule: weekly
  autoupdate_commit_msg: 'chore: pre-commit autoupdate'
  skip:
  - uv-lock

repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
  rev: v5.0.0
  hooks:
  - id: check-json
    exclude: |
      (?x)^(
        \.vscode/.*\.json
      )$
  - id: check-toml
  - id: check-yaml
  - id: end-of-file-fixer
  - id: trailing-whitespace

- repo: https://github.com/python-jsonschema/check-jsonschema
  rev: 0.33.0
  hooks:
  - id: check-dependabot
  - id: check-github-workflows
  - id: check-meltano

- repo: https://github.com/astral-sh/ruff-pre-commit
  rev: v0.11.11
  hooks:
  - id: ruff
    args: [--fix, --exit-non-zero-on-fix, --show-fixes]
  - id: ruff-format

- repo: https://github.com/pre-commit/mirrors-mypy
  rev: v1.15.0
  hooks:
  - id: mypy
    additional_dependencies:
    - types-requests

- repo: https://github.com/astral-sh/uv-pre-commit
  rev: 0.7.8
  hooks:
  - id: uv-lock
  - id: uv-sync

tap-personio/.secrets/.gitignore

@@ -0,0 +1,10 @@
# IMPORTANT! This folder is hidden from git - if you need to store config files or other secrets,
# make sure those are never staged for commit into your git repo. You can store them here or another
# secure location.
#
# Note: This may duplicate a rule in the project-level .gitignore and is kept
# here as a safeguard. If the `.secrets` folder is not needed, you may delete it
# from the project.
*
!.gitignore

tap-personio/.vscode/launch.json

@@ -0,0 +1,20 @@
{
// Use IntelliSense to learn about possible attributes.
// Hover to view descriptions of existing attributes.
// For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
"version": "0.2.0",
"configurations": [
{
"name": "tap-personio",
"type": "debugpy",
"request": "launch",
"cwd": "${workspaceFolder}",
"program": "tap_personio",
"justMyCode": false,
"args": [
"--config",
".secrets/config.json",
],
},
]
}

tap-personio/README.md

@@ -0,0 +1,136 @@
# tap-personio
`tap-personio` is a Singer tap for Personio.
Built with the [Meltano Tap SDK](https://sdk.meltano.com) for Singer Taps.
<!--
Developer TODO: Update the below as needed to correctly describe the install procedure. For instance, if you do not have a PyPI repo, or if you want users to directly install from your git repo, you can modify this step as appropriate.
## Installation
Install from PyPI:
```bash
pipx install tap-personio
```
Install from GitHub:
```bash
pipx install git+https://github.com/ORG_NAME/tap-personio.git@main
```
-->
## Configuration
### Accepted Config Options
<!--
Developer TODO: Provide a list of config options accepted by the tap.
This section can be created by copy-pasting the CLI output from:
```
tap-personio --about --format=markdown
```
-->
A full list of supported settings and capabilities for this
tap is available by running:
```bash
tap-personio --about
```
### Configure using environment variables
When `--config=ENV` is passed, this Singer tap reads its settings from environment variables and
also loads any variables defined in the working directory's `.env` file. A config value is picked up
if a matching environment variable is set either in the terminal context or in the `.env` file.
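As a rough illustration of that mapping, the sketch below collects `TAP_PERSONIO_*` variables into a config dictionary. It is not the SDK's implementation: the helper name `env_to_config` is hypothetical, the prefix follows the `TAP_PERSONIO_*` pattern passed through in `tox.ini`, and the setting names are the ones declared in `tap_personio/tap.py`.

```python
from __future__ import annotations

import os

# Assumed prefix: the tap name upper-cased, as in tox.ini's `pass_env`.
PREFIX = "TAP_PERSONIO_"
# Setting names declared in tap_personio/tap.py.
SETTINGS = ("client_id", "client_secret", "start_date", "partner_id", "app_id")


def env_to_config(environ: dict[str, str] | None = None) -> dict[str, str]:
    """Collect TAP_PERSONIO_* variables into a config dict (illustration only)."""
    env = os.environ if environ is None else environ
    config: dict[str, str] = {}
    for setting in SETTINGS:
        value = env.get(PREFIX + setting.upper())
        if value is not None:
            config[setting] = value
    return config


if __name__ == "__main__":
    # Variables without the prefix are ignored.
    sample = {"TAP_PERSONIO_CLIENT_ID": "example-id", "UNRELATED": "ignored"}
    print(env_to_config(sample))
```

Passing a plain dictionary instead of `os.environ` keeps the helper easy to test without touching the real environment.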
### Source Authentication and Authorization
<!--
Developer TODO: If your tap requires special access on the source system, or any special authentication requirements, provide those here.
-->
## Usage
You can easily run `tap-personio` by itself or in a pipeline using [Meltano](https://meltano.com/).
### Executing the Tap Directly
```bash
tap-personio --version
tap-personio --help
tap-personio --config CONFIG --discover > ./catalog.json
```
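The `CONFIG` argument above is a path to a JSON file. A minimal sketch of generating one, assuming the setting names declared in `tap_personio/tap.py` and the `.secrets/config.json` location used in `.vscode/launch.json`; the helper name `write_config` and all values are placeholders:

```python
import json
import tempfile
from pathlib import Path


def write_config(path: Path, client_id: str, client_secret: str) -> Path:
    """Write a minimal tap config file and return its path (illustration only)."""
    config = {
        "client_id": client_id,
        "client_secret": client_secret,
        # Same default start date as in meltano.yml.
        "start_date": "2010-01-01T00:00:00Z",
    }
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return path


if __name__ == "__main__":
    # Written to a temporary directory here; in the project you would
    # target `.secrets/config.json`, which is ignored by git.
    with tempfile.TemporaryDirectory() as tmp:
        target = write_config(Path(tmp) / ".secrets" / "config.json", "id", "secret")
        print(target.read_text(encoding="utf-8"))
```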
## Developer Resources
Follow these instructions to contribute to this project.
### Initialize your Development Environment
Prerequisites:
- Python 3.9+
- [uv](https://docs.astral.sh/uv/)
```bash
uv sync
```
### Create and Run Tests
Create tests within the `tests` subfolder and
then run:
```bash
uv run pytest
```
You can also test the `tap-personio` CLI interface directly using `uv run`:
```bash
uv run tap-personio --help
```
### Testing with [Meltano](https://www.meltano.com)
_**Note:** This tap will work in any Singer environment and does not require Meltano.
Examples here are for convenience and to streamline end-to-end orchestration scenarios._
<!--
Developer TODO:
Your project comes with a custom `meltano.yml` project file already created. Open the `meltano.yml` and follow any "TODO" items listed in
the file.
-->
Next, install Meltano (if you haven't already) and any needed plugins:
```bash
# Install meltano
pipx install meltano
# Initialize meltano within this directory
cd tap-personio
meltano install
```
Now you can test and orchestrate using Meltano:
```bash
# Test invocation:
meltano invoke tap-personio --version
# OR run a test ELT pipeline:
meltano run tap-personio target-jsonl
```
### SDK Dev Guide
See the [dev guide](https://sdk.meltano.com/en/latest/dev_guide.html) for more instructions on how to use the SDK to
develop your own taps and targets.

tap-personio/meltano.yml

@@ -0,0 +1,49 @@
version: 1
send_anonymous_usage_stats: true
project_id: "tap-personio"
default_environment: test
venv:
  backend: uv
environments:
- name: test
plugins:
  extractors:
  - name: "tap-personio"
    namespace: "tap_personio"
    pip_url: -e .
    capabilities:
    - state
    - catalog
    - discover
    - about
    - stream-maps

    # TODO: Declare settings and their types here:
    # The names below mirror config_jsonschema in tap_personio/tap.py.
    settings:
    - name: client_id
      kind: password
      label: Client ID
      description: The client id to authenticate against the Personio API
      sensitive: true
    - name: client_secret
      kind: password
      label: Client Secret
      description: The client secret to authenticate against the Personio API
      sensitive: true
    - name: start_date
      kind: date_iso8601
      label: Start Date
      description: Initial date to start extracting data from

    # TODO: Declare required settings here:
    settings_group_validation:
    - [client_id, client_secret]

    # TODO: Declare default configuration values here:
    config:
      start_date: '2010-01-01T00:00:00Z'

  loaders:
  - name: target-jsonl
    variant: andyh1203
    pip_url: target-jsonl

tap-personio/output/.gitignore

@@ -0,0 +1,4 @@
# This directory is used as a target by target-jsonl, so ignore all files
*
!.gitignore


@@ -0,0 +1,72 @@
[project]
name = "tap-personio"
version = "0.0.1"
description = "Singer tap for Personio, built with the Meltano Singer SDK."
readme = "README.md"
authors = [{ name = "Jeroen Vandensteen", email = "jeroen@hrlakehouse.com" }]
keywords = [
"ELT",
"Personio",
]
classifiers = [
"Intended Audience :: Developers",
"Operating System :: OS Independent",
"Programming Language :: Python :: 3.9",
"Programming Language :: Python :: 3.10",
"Programming Language :: Python :: 3.11",
"Programming Language :: Python :: 3.12",
"Programming Language :: Python :: 3.13",
]
license-files = [ "LICENSE" ]
requires-python = ">=3.9"
dependencies = [
"singer-sdk~=0.46.4",
"requests~=2.32.3",
]
[project.optional-dependencies]
s3 = [
"s3fs~=2025.5.0",
]
[project.scripts]
# CLI declaration
tap-personio = 'tap_personio.tap:TapPersonio.cli'
[dependency-groups]
dev = [
{ include-group = "test" },
]
test = [
"pytest>=8",
"singer-sdk[testing]",
]
[tool.pytest.ini_options]
addopts = [
"--durations=10",
]
[tool.mypy]
warn_unused_configs = true
[tool.ruff]
target-version = "py39"
[tool.ruff.lint]
ignore = [
"COM812", # missing-trailing-comma
]
select = ["ALL"]
[tool.ruff.lint.flake8-annotations]
allow-star-arg-any = true
[tool.ruff.lint.pydocstyle]
convention = "google"
[build-system]
requires = [
"hatchling>=1,<2",
]
build-backend = "hatchling.build"


@@ -0,0 +1 @@
"""Tap for Personio."""


@@ -0,0 +1,7 @@
"""Personio entry point."""
from __future__ import annotations
from tap_personio.tap import TapPersonio
TapPersonio.cli()


@@ -0,0 +1,44 @@
"""Personio Authentication."""

from __future__ import annotations

from singer_sdk.authenticators import OAuthAuthenticator, SingletonMeta


# The SingletonMeta metaclass makes your streams reuse the same authenticator instance.
# If this behaviour interferes with your use-case, you can remove the metaclass.
class PersonioAuthenticator(OAuthAuthenticator, metaclass=SingletonMeta):
    """Authenticator class for Personio."""

    @property
    def oauth_request_body(self) -> dict:
        """Define the OAuth request body for the Personio API.

        Returns:
            A dict with the request body
        """
        # TODO: Confirm the request body against the Personio API documentation.
        # The tap config declares `client_id` and `client_secret` (see tap.py),
        # not `username`/`password`, so a client-credentials grant is assumed.
        return {
            "scope": self.oauth_scopes,
            "client_id": self.config["client_id"],
            "client_secret": self.config["client_secret"],
            "grant_type": "client_credentials",
        }

    @classmethod
    def create_for_stream(cls, stream) -> PersonioAuthenticator:  # noqa: ANN001
        """Instantiate an authenticator for a specific Singer stream.

        Args:
            stream: The Singer stream instance.

        Returns:
            A new authenticator.
        """
        return cls(
            stream=stream,
            auth_endpoint="TODO: OAuth Endpoint URL",
            oauth_scopes="TODO: OAuth Scopes",
        )


@@ -0,0 +1,174 @@
"""REST client handling, including PersonioStream base class."""

from __future__ import annotations

import decimal
import typing as t
from functools import cached_property
from importlib import resources
from urllib.parse import parse_qsl

from singer_sdk.helpers.jsonpath import extract_jsonpath
from singer_sdk.pagination import BaseAPIPaginator, BaseHATEOASPaginator, first
from singer_sdk.streams import RESTStream

from tap_personio.auth import PersonioAuthenticator

if t.TYPE_CHECKING:
    import requests
    from singer_sdk.helpers.types import Auth, Context

# TODO: Delete this if not using json files for schema definition
SCHEMAS_DIR = resources.files(__package__) / "schemas"


class MyPaginator(BaseHATEOASPaginator):
    """Paginator that follows the `next` link in the response metadata."""

    def get_next_url(self, response: requests.Response) -> str | None:
        """Return the URL of the next page, or None on the last page."""
        try:
            return first(
                extract_jsonpath("$._meta.links.next.href", response.json())
            )
        except StopIteration:
            return None


class PersonioStream(RESTStream):
    """Personio stream class."""

    # Limit the number of results per page
    # Max 50 according to Personio API documentation.
    RESULTS_PER_PAGE = 50

    # Not used by MyPaginator, which follows `_meta.links.next.href` instead.
    NEXT_PAGE_JSONPATH = "$.cursor"

    # Update this value if necessary or override `parse_response`.
    records_jsonpath = "$._data[*]"

    @property
    def url_base(self) -> str:
        """Return the API URL root, configurable via tap settings."""
        # TODO: hardcode a value here, or retrieve it from self.config
        # The stream paths already start with "/v2", so the base URL must not
        # carry a version segment of its own.
        return "https://api.personio.de"

    @cached_property
    def authenticator(self) -> Auth:
        """Return a new authenticator object.

        Returns:
            An authenticator instance.
        """
        return PersonioAuthenticator.create_for_stream(self)

    @property
    def http_headers(self) -> dict:
        """Return the http headers needed.

        Returns:
            A dictionary of HTTP headers.
        """
        return {
            "X-Personio-Partner-ID": self.config.get("partner_id", ""),
            "X-Personio-App-ID": self.config.get("app_id", ""),
        }

    def get_new_paginator(self) -> BaseAPIPaginator:
        """Create a new pagination helper instance.

        If the source API can make use of the `next_page_token_jsonpath`
        attribute, or it contains a `X-Next-Page` header in the response
        then you can remove this method.

        If you need custom pagination that uses page numbers, "next" links, or
        other approaches, please read the guide: https://sdk.meltano.com/en/v0.25.0/guides/pagination-classes.html.

        Returns:
            A pagination helper instance.
        """
        return MyPaginator()

    def get_url_params(
        self,
        context: Context | None,  # noqa: ARG002
        next_page_token: t.Any | None,  # noqa: ANN401
    ) -> dict[str, t.Any]:
        """Return a dictionary of values to be used in URL parameterization.

        Args:
            context: The stream context.
            next_page_token: The next page index or value.

        Returns:
            A dictionary of URL query parameters.
        """
        params: dict = {}
        # The next page token is a parsed URL, so copy its query string
        # into the request parameters.
        if next_page_token:
            params.update(parse_qsl(next_page_token.query))

        # Set the results limit
        params["limit"] = self.RESULTS_PER_PAGE

        # The Personio API does not support sorting, so this is commented out.
        # if self.replication_key:
        #     params["sort"] = "asc"
        #     params["order_by"] = self.replication_key
        return params

    def prepare_request_payload(
        self,
        context: Context | None,  # noqa: ARG002
        next_page_token: t.Any | None,  # noqa: ARG002, ANN401
    ) -> dict | None:
        """Prepare the data payload for the REST API request.

        By default, no payload will be sent (return None).

        Args:
            context: The stream context.
            next_page_token: The next page index or value.

        Returns:
            A dictionary with the JSON body for a POST request.
        """
        # TODO: Delete this method if no payload is required. (Most REST APIs.)
        return None

    def parse_response(self, response: requests.Response) -> t.Iterable[dict]:
        """Parse the response and return an iterator of result records.

        Args:
            response: The HTTP ``requests.Response`` object.

        Yields:
            Each record from the source.
        """
        # TODO: Parse response body and return a set of records.
        yield from extract_jsonpath(
            self.records_jsonpath,
            input=response.json(parse_float=decimal.Decimal),
        )

    def post_process(
        self,
        row: dict,
        context: Context | None = None,  # noqa: ARG002
    ) -> dict | None:
        """As needed, append or transform raw data to match expected structure.

        Args:
            row: An individual record from the stream.
            context: The stream context.

        Returns:
            The updated record dictionary, or ``None`` to skip the record.
        """
        # TODO: Delete this method if not needed.
        return row
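The pagination flow in `MyPaginator` and `get_url_params` can be sketched without the SDK: read the `next` link from the response metadata, parse it, and reuse its query string for the following request. This is a standalone approximation, not the SDK's code path. The helper name `next_page_params` and the `cursor` parameter in the sample URL are assumptions for illustration; only the `_meta.links.next.href` location and the limit of 50 come from the client above.

```python
from __future__ import annotations

from urllib.parse import parse_qsl, urlparse


def next_page_params(response_body: dict, limit: int = 50) -> dict | None:
    """Return query params for the next page, or None when there is no next link."""
    # Walk _meta -> links -> next -> href, tolerating missing or null levels.
    node: object = response_body
    for key in ("_meta", "links", "next", "href"):
        if not isinstance(node, dict):
            return None
        node = node.get(key)
    if not isinstance(node, str) or not node:
        return None

    # Reuse the query string of the next link, then enforce the page size.
    params: dict = dict(parse_qsl(urlparse(node).query))
    params["limit"] = limit
    return params


if __name__ == "__main__":
    # Hypothetical response body; the `cursor` parameter name is an assumption.
    body = {
        "_data": [],
        "_meta": {
            "links": {"next": {"href": "https://api.personio.de/v2/persons?cursor=abc"}}
        },
    }
    print(next_page_params(body))
    print(next_page_params({"_data": [], "_meta": {"links": {}}}))
```

Returning `None` on the last page mirrors how the paginator signals that no further request is needed.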


@@ -0,0 +1 @@
"""JSON schema files for the REST API."""


@@ -0,0 +1,156 @@
{
"type": "object",
"properties": {
"id": {
"type": ["string", "null"]
},
"status": {
"type": ["string", "null"],
"enum": ["ACTIVE", "INACTIVE", "ONBOARDING", "LEAVE", "UNSPECIFIED"]
},
"weekly_working_hours": {
"type": ["integer", "null"]
},
"full_time_weekly_working_hours": {
"type": ["integer", "null"]
},
"probation_end_date": {
"type": ["string", "null"],
"format": "date"
},
"employment_start_date": {
"type": ["string", "null"],
"format": "date"
},
"employment_end_date": {
"type": ["string", "null"],
"format": "date"
},
"type": {
"type": ["string", "null"],
"enum": ["INTERNAL", "EXTERNAL", "UNSPECIFIED"]
},
"contract_end_date": {
"type": ["string", "null"],
"format": "date"
},
"created_at": {
"type": ["string", "null"],
"format": "date-time"
},
"updated_at": {
"type": ["string", "null"],
"format": "date-time"
},
"position": {
"type": "object",
"properties": {
"title": {
"type": ["string", "null"]
}
}
},
"supervisor": {
"type": "object",
"properties": {
"id": {
"type": ["string", "null"]
}
}
},
"office": {
"type": "object",
"properties": {
"id": {
"type": ["string", "null"]
}
}
},
"legal_entity": {
"type": "object",
"properties": {
"id": {
"type": ["string", "null"],
"description": "Identifier of the legal entity"
}
},
"required": ["id"],
"additionalProperties": false
},
"org_units": {
"type": "array",
"items": {
"type": "object",
"properties": {
"type": {
"type": ["string", "null"]
},
"id": {
"type": ["string", "null"]
}
}
}
},
"cost_centers": {
"type": "array",
"items": {
"type": "object",
"properties": {
"id": {
"type": ["string", "null"]
},
"weight": {
"type": ["integer", "null"]
}
}
}
},
"person": {
"type": "object",
"properties": {
"id": {
"type": ["string", "null"]
}
}
},
"termination": {
"type": ["object", "null"],
"properties": {
"termination_date": {
"type": ["string", "null"],
"format": "date"
},
"last_working_date": {
"type": ["string", "null"],
"format": "date"
},
"terminated_at": {
"type": ["string", "null"],
"format": "date"
},
"type": {
"type": ["string", "null"],
"enum": [
"UNSPECIFIED",
"EMPLOYEE",
"FIRED",
"DEATH",
"CONTRACT_EXPIRED",
"AGREEMENT",
"SUB_COMPANY_SWITCH",
"IRREVOCABLE_SUSPENSION",
"CANCELLATION",
"COLLECTIVE_AGREEMENT",
"SETTLEMENT_AGREEMENT",
"RETIREMENT",
"COURT_SETTLEMENT",
"QUIT"
]
},
"reason": {
"type": ["string", "null"]
}
}
}
}
}


@@ -0,0 +1,203 @@
{
"type": "object",
"properties": {
"id": {
"type": "string"
},
"status": {
"type": "string",
"enum": ["ACTIVE", "INACTIVE", "SUSPENDED", "DELETED"]
},
"is_main": {
"type": "boolean"
},
"valid_from": {
"type": "string",
"format": "date",
"description": "Date from which this record is valid"
},
"assigned_employees": {
"type": "object",
"properties": {
"active": {
"type": "integer",
"minimum": 0,
"description": "Number of active employees"
},
"total": {
"type": "integer",
"minimum": 0,
"description": "Total number of employees"
}
},
"required": ["active", "total"],
"additionalProperties": false
},
"country": {
"type": "string",
"pattern": "^[A-Z]{2}$",
"description": "Country code in ISO 3166-1 alpha-2 format"
},
"name": {
"type": "string",
"description": "Legal name of the company"
},
"type": {
"type": "string",
"enum": ["GMBH", "AG", "UG", "KG", "OHG", "EV", "VV"],
"description": "Legal form of the company"
},
"registration_number": {
"type": "string",
"description": "Official registration number of the company"
},
"industry_sector": {
"type": "string",
"enum": [
"COMPUTER_SOFTWARE",
"MANUFACTURING",
"FINANCIAL_SERVICES",
"HEALTHCARE",
"RETAIL",
"TELECOMMUNICATIONS",
"CONSTRUCTION",
"EDUCATION"
],
"description": "Primary industry sector of the company"
},
"email": {
"type": "string",
"format": "email",
"description": "Primary email address of the company"
},
"phone": {
"type": "string",
"description": "Primary phone number of the company"
},
"address": {
"type": "object",
"required": [
"street_name",
"house_number",
"postal_code",
"city"
],
"properties": {
"street_name": {
"type": "string",
"description": "Name of the street"
},
"house_number": {
"type": "string",
"description": "House/building number"
},
"postal_code": {
"type": "string",
"description": "Postal/ZIP code"
},
"city": {
"type": "string",
"description": "City name"
},
"state": {
"type": "string",
"description": "State or region code"
}
},
"additionalProperties": false
},
"contact_person": {
"type": "object",
"properties": {
"salutation": {
"type": "string",
"enum": ["MR", "MRS", "MS", "DR", "PROF"],
"description": "Salutation of the contact person"
},
"full_name": {
"type": "string",
"description": "Full name of the contact person"
},
"email": {
"type": "string",
"format": "email",
"description": "Email address of the contact person"
},
"phone": {
"type": "string",
"description": "Phone number of the contact person"
},
"fax": {
"type": "string",
"description": "Fax number of the contact person"
}
},
"required": ["full_name"],
"additionalProperties": false
},
"bank_details": {
"type": "object",
"properties": {
"iban": {
"type": "string",
"pattern": "^[A-Z]{2}[0-9]{2}[a-zA-Z0-9]{1,30}$",
"description": "International Bank Account Number"
},
"bic": {
"type": "string",
"pattern": "^[A-Z]{6}[A-Z0-9]{2}([A-Z0-9]{3})?$",
"description": "Bank Identifier Code"
},
"account_holder": {
"type": "string",
"description": "Name of the account holder"
}
},
"required": ["iban", "bic", "account_holder"],
"additionalProperties": false
},
"mailing_address": {
"type": "object",
"required": [
"street_name",
"house_number",
"postal_code",
"city",
"address_type"
],
"properties": {
"street_name": {
"type": "string",
"description": "Name of the street"
},
"house_number": {
"type": "string",
"description": "House/building number"
},
"postal_code": {
"type": "string",
"description": "Postal/ZIP code"
},
"city": {
"type": "string",
"description": "City name"
},
"name": {
"type": "string",
"description": "Name for this address (e.g., 'Headquarters')"
},
"additional_info": {
"type": "string",
"description": "Additional address information"
},
"address_type": {
"type": "string",
"enum": ["REGULAR_ADDRESS", "PO_BOX", "PACKSTATION"],
"description": "Type of mailing address"
}
},
"additionalProperties": false
}
},
"additionalProperties": false
}


@@ -0,0 +1,141 @@
{
"type": "object",
"properties": {
"id": {
"type": ["string", "null"]
},
"status": {
"type": ["string", "null"],
"enum": ["ACTIVE", "OUT_OF_BUSINESS"]
},
"is_main": {
"type": ["boolean", "null"]
},
"valid_from": {
"type": ["string", "null"],
"format": "date"
},
"assigned_employees": {
"type": "object",
"properties": {
"active": {
"type": "integer"
},
"total": {
"type": "integer"
}
}
},
"country": {
"type": ["string", "null"],
"pattern": "^[A-Z]{2}$"
},
"name": {
"type": ["string", "null"]
},
"type": {
"type": ["string", "null"]
},
"registration_number": {
"type": ["string", "null"]
},
"industry_sector": {
"type": ["string", "null"],
"enum": ["ACCOUNTING_AND_AUDITING_SERVICES", "ADVERTISING_AND_PR_SERVICES", "AEROSPACE", "AGRICULTURE", "ARCHITECTURAL", "AUTOMOTIVE", "AUTOMOTIVE_PARTS", "BANKING", "BIOTECHNOLOGY", "BROADCASTING", "BUSINESS_SERVICES", "CHEMICALS", "CLOTHING", "COMPUTER_HARDWARE", "COMPUTER_SERVICES", "COMPUTER_SOFTWARE", "CONSTRUCTION", "CONSTRUCTION_RESIDENTIAL", "EDUCATION", "ELECTRONICS", "ENERGY", "ENGINEERING_SERVICES", "ENTERTAINMENT_VENUES", "FINANCIAL_SERVICES", "FOOD_AND_BEVERAGE", "GOVERNMENT", "HEALTHCARE_SERVICES", "HOTELS", "INSURANCE", "INTERNET", "LEGAL_SERVICES", "MANAGEMENT", "MANUFACTURING", "MARINE_MFG", "MEDICAL_DEVICES", "METALS", "NONPROFIT", "OTHER", "PERFORMING", "PERSONAL", "PERSONAL_CARE", "PRINTING", "REAL_ESTATE", "RENTAL_SERVICES", "RESTAURANT", "RETAIL", "SECURITY", "SPORTS", "STAFFING", "TELECOMMUNICATIONS", "TRADE_EXPORT", "TRANSPORT", "TRAVEL", "WASTE_MANAGEMENT"]
},
"email": {
"type": ["string", "null"]
},
"phone": {
"type": ["string", "null"]
},
"address": {
"type": "object",
"properties": {
"street_name": {
"type": ["string", "null"]
},
"house_number": {
"type": ["string", "null"]
},
"postal_code": {
"type": ["string", "null"]
},
"city": {
"type": ["string", "null"]
},
"state": {
"type": ["string", "null"]
}
}
},
"contact_person": {
"type": "object",
"properties": {
"salutation": {
"type": ["string", "null"],
"enum": ["MR", "MRS", "MS", "DR", "PROF"]
},
"full_name": {
"type": ["string", "null"]
},
"email": {
"type": ["string", "null"]
},
"phone": {
"type": ["string", "null"]
},
"fax": {
"type": ["string", "null"]
}
}
},
"bank_details": {
"type": "object",
"properties": {
"iban": {
"type": ["string", "null"]
},
"bic": {
"type": ["string", "null"]
},
"account_holder": {
"type": ["string", "null"]
}
}
},
"mailing_address": {
"type": "object",
"properties": {
"street_name": {
"type": ["string", "null"]
},
"house_number": {
"type": ["string", "null"]
},
"postal_code": {
"type": ["string", "null"]
},
"city": {
"type": ["string", "null"]
},
"name": {
"type": ["string", "null"]
},
"additional_info": {
"type": ["string", "null"]
},
"address_type": {
"type": ["string", "null"],
"enum": ["REGULAR_ADDRESS", "PO_BOX", "EPOST", "FOREIGN_ADDRESS"]
},
"country": {
"type": ["string", "null"]
},
"po_box_number": {
"type": ["string", "null"]
}
}
}
}
}


@@ -0,0 +1,55 @@
{
"type": "object",
"properties": {
"id": {
"type": ["string", "null"]
},
"email": {
"type": ["string", "null"]
},
"created_at": {
"type": ["string", "null"],
"format": "date-time"
},
"updated_at": {
"type": ["string", "null"],
"format": "date-time"
},
"first_name": {
"type": ["string", "null"]
},
"last_name": {
"type": ["string", "null"]
},
"preferred_name": {
"type": ["string", "null"]
},
"gender": {
"type": ["string", "null"],
"enum": ["UNSPECIFIED", "MALE", "FEMALE", "DIVERSE"]
},
"profile_picture": {
"type": "object",
"properties": {
"url": {
"type": ["string", "null"]
}
}
},
"status": {
"type": ["string", "null"],
"enum": ["ACTIVE", "INACTIVE", "UNSPECIFIED"]
},
"employments": {
"type": ["array", "null"],
"items": {
"type": "object",
"properties": {
"id": {
"type": ["string", "null"]
}
}
}
}
}
}


@@ -0,0 +1,120 @@
"""Stream type classes for tap-personio."""

from __future__ import annotations

from importlib import resources

from tap_personio.client import PersonioStream

SCHEMAS_DIR = resources.files(__package__) / "schemas"


class PersonsStream(PersonioStream):
    """Persons stream; parent of the employments stream."""

    name = "persons"
    path = "/v2/persons"
    primary_keys = ["id"]
    replication_key = "updated_at"
    is_sorted = False
    schema_filepath = SCHEMAS_DIR / "persons.json"

    def get_child_context(self, record: dict, context: dict | None) -> dict:
        """Return a context dictionary for child streams."""
        return {
            "person_id": record["id"],
        }


class EmploymentsStream(PersonioStream):
    """Employments stream, synced once per parent person."""

    name = "employments"
    path = "/v2/persons/{person_id}/employments"
    primary_keys = ["id"]
    replication_key = "updated_at"
    is_sorted = False
    schema_filepath = SCHEMAS_DIR / "employments.json"

    # EmploymentsStream should be invoked once per parent Person
    parent_stream_type = PersonsStream

    # Assume a person's "updated_at" is not bumped when one of their
    # employments changes, so do not filter by the parent's replication key.
    ignore_parent_replication_key = True


# Getting the Org Units, based on the IDs received from the EmploymentsStream
# Also fetches the parent chain for each org unit, to ensure we have the full hierarchy
class OrgUnitStream(PersonioStream):
    """Org units referenced by employments, including their parent chain."""

    name = "org-units"
    path = "/v2/org-units/{id}"
    schema_filepath = SCHEMAS_DIR / "org-units.json"
    primary_keys = ["id"]

    def __init__(self, tap):
        super().__init__(tap)
        self._org_unit_ids = None

    def get_url_params(self, context, next_page_token):
        """Add the include_parent_chain parameter to all requests."""
        params = super().get_url_params(context, next_page_token)
        params["include_parent_chain"] = "true"
        return params

    def get_records(self, context):
        """Yield each required org unit and its parents exactly once."""
        initial_org_unit_ids = self._get_required_org_unit_ids()
        processed_ids = set()

        for org_unit_id in initial_org_unit_ids:
            try:
                response_data = self._fetch_org_unit(org_unit_id)
                # Extract and yield all org units from this response
                yield from self._extract_org_units(response_data, processed_ids)
            except Exception as e:
                self.logger.error("Failed to fetch org unit %s: %s", org_unit_id, e)

    def _extract_org_units(self, response_data, processed_ids):
        """Extract all org units from a single API response."""
        all_org_units = []

        # Add the main org unit
        main_org_unit = {k: v for k, v in response_data.items() if k != "parent_chain"}
        all_org_units.append(main_org_unit)

        # Add all parent org units
        all_org_units.extend(response_data.get("parent_chain", []))

        # Yield only unique org units
        for org_unit in all_org_units:
            org_unit_id = org_unit["id"]
            if org_unit_id not in processed_ids:
                processed_ids.add(org_unit_id)
                yield org_unit

    def _fetch_org_unit(self, org_unit_id):
        """Request a single org unit and return the decoded JSON body."""
        context = {"id": org_unit_id}
        # `_request` expects a prepared request, not a bare URL.
        prepared_request = self.build_prepared_request(
            method="GET",
            url=self.get_url(context),
            params=self.get_url_params(context, None),
            headers=self.http_headers,
        )
        response = self.request_decorator(self._request)(prepared_request, context)
        return response.json()

    def _get_required_org_unit_ids(self):
        """Collect the org unit IDs referenced by all employments."""
        # This could read from tap state, a file, or re-scan employment data.
        # Re-scanning repeats the persons and employments requests; employments
        # are fetched per person because their path needs a `person_id`.
        persons_stream = PersonsStream(self._tap)
        employment_stream = EmploymentsStream(self._tap)
        org_unit_ids = set()

        for person in persons_stream.get_records(None):
            child_context = persons_stream.get_child_context(person, None)
            for record in employment_stream.get_records(child_context):
                # employments.json lists org units under "org_units".
                for org_unit in record.get("org_units") or []:
                    if org_unit.get("id"):
                        org_unit_ids.add(org_unit["id"])

        return org_unit_ids


class LegalEntitiesStream(PersonioStream):
    """Legal entities stream."""

    name = "legal-entities"
    path = "/v2/legal-entities"
    primary_keys = ["id"]
    replication_key = "updated_at"
    is_sorted = False
    schema_filepath = SCHEMAS_DIR / "legal-entities.json"
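The de-duplication in `OrgUnitStream._extract_org_units` is easier to follow in isolation. The sketch below flattens each response into the org unit itself plus its `parent_chain` and yields every ID only once. The function name `unique_org_units` and the sample payloads are hypothetical; the real response shape should be checked against the Personio API.

```python
from __future__ import annotations

from collections.abc import Iterable, Iterator


def unique_org_units(responses: Iterable[dict]) -> Iterator[dict]:
    """Yield each org unit (and each parent) once across all responses."""
    seen: set[str] = set()
    for response in responses:
        # The main org unit is the response minus its parent chain.
        main = {k: v for k, v in response.items() if k != "parent_chain"}
        for unit in [main, *response.get("parent_chain", [])]:
            if unit["id"] not in seen:
                seen.add(unit["id"])
                yield unit


if __name__ == "__main__":
    # Two teams sharing the same parent: the parent is emitted only once.
    responses = [
        {"id": "team-a", "parent_chain": [{"id": "root"}]},
        {"id": "team-b", "parent_chain": [{"id": "root"}]},
    ]
    print([unit["id"] for unit in unique_org_units(responses)])
```

Keeping the `seen` set outside the per-response loop is what prevents a shared parent from being emitted for every child.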


@@ -0,0 +1,65 @@
"""Personio tap class."""

from __future__ import annotations

from singer_sdk import Tap
from singer_sdk import typing as th  # JSON schema typing helpers

# TODO: Import your custom stream types here:
from tap_personio import streams


class TapPersonio(Tap):
    """Personio tap class."""

    name = "tap-personio"

    config_jsonschema = th.PropertiesList(
        th.Property(
            "client_id",
            th.StringType,
            required=True,
            secret=True,
            description="The client id to authenticate against the Personio API",
        ),
        th.Property(
            "client_secret",
            th.StringType,
            required=True,
            secret=True,
            description="The client secret to authenticate against the Personio API",
        ),
        th.Property(
            "start_date",
            th.DateTimeType,
            description="The earliest record date to sync",
        ),
        th.Property(
            "partner_id",
            th.StringType,
            required=False,
            default="",
            description="The partner ID for the Personio API, if applicable",
        ),
        th.Property(
            "app_id",
            th.StringType,
            required=True,
            default="",
            description="The app ID for the Personio API",
        ),
    ).to_dict()

    def discover_streams(self) -> list[streams.PersonioStream]:
        """Return a list of discovered streams.

        Returns:
            A list of discovered streams.
        """
        # `streams.EmployeesStream` does not exist; register the stream
        # classes that streams.py actually defines.
        return [
            streams.PersonsStream(self),
            streams.EmploymentsStream(self),
            streams.OrgUnitStream(self),
            streams.LegalEntitiesStream(self),
        ]


if __name__ == "__main__":
    TapPersonio.cli()


@@ -0,0 +1 @@
"""Test suite for tap-personio."""


@@ -0,0 +1,22 @@
"""Tests standard tap features using the built-in SDK tests library."""
import datetime
from singer_sdk.testing import get_tap_test_class
from tap_personio.tap import TapPersonio
SAMPLE_CONFIG = {
"start_date": datetime.datetime.now(datetime.timezone.utc).strftime("%Y-%m-%d"),
# TODO: Initialize minimal tap config
}
# Run standard built-in tap tests from the SDK:
TestTapPersonio = get_tap_test_class(
tap_class=TapPersonio,
config=SAMPLE_CONFIG,
)
# TODO: Create additional tests as appropriate for your tap.

tap-personio/tox.ini

@@ -0,0 +1,15 @@
# This file can be used to customize tox tests as well as other test frameworks like flake8 and mypy

[tox]
envlist = py3{9,10,11,12,13}
minversion = 4.22
requires =
    tox>=4.22

[testenv]
pass_env =
    TAP_PERSONIO_*
dependency_groups =
    test
commands =
    pytest {posargs}

tap-personio/uv.lock (generated)

File diff suppressed because it is too large