HPDBSCAN Benchmark test files
Four data sets used for the benchmark of HPDBSCAN. Collection of geo-tagged Twitter data: This set has been collected and made available to us by Junjun Yin from the National... -
Cinderella - tool for Clustering and Classifications of Texts in Polish
A system for clustering and classification of texts in Polish. Source code. -
The news articles reporting on the 2021 Tokyo Olympics data set OG2021 (resea...
The OG2021 corpus contains multilingual news articles that are reporting on the events happening during the 2021 Tokyo Olympics. The data set was created to evaluate the... -
The Orange workflow for observing collocation clusters ColEmbed 1.0
ColEmbed is a workflow (.OWS file) for Orange Data Mining (an open-source machine learning and data... -
The news articles reporting on the 2021 Tokyo Olympics data set OG2021 (public)
The OG2021 corpus contains multilingual news articles that are reporting on the events happening during the 2021 Tokyo Olympics. The data set was created to evaluate the... -
VPS-30-En is a small lexical resource that contains the following 30 English verbs: access, ally, arrive, breathe, claim, cool, crush, cry, deny, enlarge, enlist, forge,... -
VPS-GradeUp (2016-10-10)
VPS-GradeUp is a collection of triple manual annotations of 29 English verbs based on the Pattern Dictionary of English Verbs (PDEV) and comprising the following lemmas:... -
MMI_clustering is a set of command line tools implementing Mercer's maximum mutual information-based clustering technique. -
Data for: On Inter-bubble Distances and Bubble Clustering in Bubbly Flows: An...
This data set contains the processed data from ultrafast X-ray tomography measurements in a bubble column. Measurements were performed in a bubble column with 100 mm inner... -
This entry contains sets of synthetic datasets to be used for benchmarking clustering algorithms for mixed (continuous and categorical) data. The synthetic datasets correspond... -
Descriptive variables of 11,004 roots from 69 Pinus pinaster root systems exc...
This data set includes 11,004 roots from 69 coarse root systems of Pinus pinaster trees. These trees belong to the same local provenance of P. pinaster trees germinated in the... -
Approximating grouped fixed effects estimation via fuzzy clustering regressio...
We propose a new, computationally efficient way to approximate the “grouped fixed effects” (GFE) estimator of Bonhomme and Manresa (2015), which estimates grouped patterns of... -
Studies of local order and clustering in (1-x)BaTiO3-xBiYbO3 Perovskites
We propose total scattering experiments on samples of (1-x)BaTiO3-xBiYbO3 perovskites with x = 0, 0.04, 0.08 and 0.15 to resolve issues over the local structure and degree of... -
Oxide Ionic Conductivity in Anion-Deficient Fluorites
The last couple of decades have seen a rapidly increasing interest in new environmentally friendly energy sources, and one key aspect of this research involves characterisation of... -
Replication Data for: Transcriptomic-based clustering of human atheroscleroti...
Background: These data are used in this paper by Mokry et al. Abstract: Histopathological studies have revealed key processes of atherosclerotic plaque thrombosis. However, the... -
Fatbox - Fault Analysis Toolbox
Fatbox - Fault Analysis Toolbox is a Python module for the extraction and analysis of faults (and fractures) in raster data. We often observe faults in 2-D or 3-D raster data...