Dataset with four years of condition monitoring technical language annotations from paper machine industries in northern Sweden
This dataset contains four years of technical language annotations from two paper machines in northern Sweden, stored as a pickled pandas DataFrame. The same data is also available as a semicolon-separated .csv file. The data has two columns: the first holds the annotation note contents, and the second holds the annotation titles. Each row corresponds to one annotation and its title. The annotations are in Swedish and have been processed so that all mentions of personal information are replaced with the string ‘egennamn’, meaning “personal name” in Swedish.

Data can be accessed in Python with:

    import pandas as pd
    annotations_df = pd.read_pickle("Technical_Language_Annotations.pkl")
    annotation_contents = annotations_df['noteComment']
    annotation_titles = annotations_df['title']
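The semicolon-separated .csv copy can probably be read in much the same way. The following is a minimal sketch, not the dataset authors' own procedure. It assumes that the .csv file has a header row with the same column names as the dataframe (noteComment and title) and that it is UTF-8 encoded; neither is stated in the description. The helper names load_annotations_csv and count_anonymized are hypothetical, and the two sample rows are made up for illustration rather than taken from the dataset.

```python
import io

import pandas as pd


def load_annotations_csv(source):
    """Read the semicolon-separated annotations into a DataFrame.

    Assumption: the file has a header row naming the columns
    'noteComment' and 'title', as in the pickled dataframe.
    """
    return pd.read_csv(
        source,
        sep=";",                 # the description says semicolon-separated
        usecols=["noteComment", "title"],
        dtype=str,               # keep all annotation text as strings
        keep_default_na=False,   # do not turn empty notes into NaN
        encoding="utf-8",        # assumption: Swedish text stored as UTF-8
    )


def count_anonymized(df, placeholder="egennamn"):
    """Count notes that contain the anonymization placeholder."""
    notes = df["noteComment"].str.lower()
    return int(notes.str.contains(placeholder.lower(), regex=False).sum())


# Made-up sample standing in for the real file (not actual dataset rows).
sample_csv = (
    "noteComment;title\n"
    "egennamn bytte lager;Lagerbyte\n"
    "Vibrationer ökar;Larm\n"
)

sample_df = load_annotations_csv(io.StringIO(sample_csv))
anonymized_notes = count_anonymized(sample_df)
```

To use the real file, pass its path to load_annotations_csv instead of the in-memory sample. If the file turns out to have no header row, header=None and names=[...] would be needed instead of usecols.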
Documentation files
Citation and access
Corpus
Method and outcome
Geographic coverage
Administrative information
Topic and keywords
Relations
Publications
Metadata
