tra-analysis v 3.0.0 aggregate PR (#73)

* reflected doc changes to README.md

Signed-off-by: Arthur Lu <learthurgo@gmail.com>

* tra_analysis v 2.1.0-alpha.1

Signed-off-by: Arthur Lu <learthurgo@gmail.com>

* changed setup.py to use __version__ from source
added Topic and keywords

Signed-off-by: Arthur Lu <learthurgo@gmail.com>

* updated Supported Platforms in README.md

Signed-off-by: Arthur Lu <learthurgo@gmail.com>

* moved required files back to parent

Signed-off-by: Arthur Lu <learthurgo@gmail.com>

* moved security back to parent

Signed-off-by: Arthur Lu <learthurgo@gmail.com>

* moved security back to parent
moved contributing back to parent

Signed-off-by: Arthur Lu <learthurgo@gmail.com>

* add PR template

Signed-off-by: Arthur Lu <learthurgo@gmail.com>

* moved to parent folder

Signed-off-by: Arthur Lu <learthurgo@gmail.com>

* moved meta files to .github folder

Signed-off-by: Arthur Lu <learthurgo@gmail.com>

* Analysis.py v 3.0.1

Signed-off-by: Arthur Lu <learthurgo@gmail.com>

* updated test_analysis for submodules, and added missing numpy import in Sort.py

* fixed item one of Issue #58

Signed-off-by: Arthur Lu <learthurgo@gmail.com>

* readded cache searching in postCreateCommand

Signed-off-by: Arthur Lu <learthurgo@gmail.com>

* added myself as an author

* feat: created kivy gui boilerplate

* added Kivy to requirements.txt

Signed-off-by: Arthur Lu <learthurgo@gmail.com>

* feat: gui with placeholders

* fix: changed config.json path

* migrated docker base image to debian

Signed-off-by: ltcptgeneral <learthurgo@gmail.com>

* style: spaces to tabs

* migrated to ubuntu

Signed-off-by: ltcptgeneral <learthurgo@gmail.com>

* fixed issues

Signed-off-by: ltcptgeneral <learthurgo@gmail.com>

* fix: docker build?

* fix: use ubuntu bionic

* fix: get kivy installed

* @ltcptgeneral can't spell

* optim dockerfile for not installing unused packages

* install basic stuff while building the container

* use prebuilt image for development

* install pylint on base image

* rename and use new kivy

* tests: added tests for Array and CorrelationTest

Both are not working due to errors

* use new thing

* use 20.04 base

* symlink pip3 to pip

* use pip instead of pip3

* equation.Expression.py v 0.0.1-alpha
added corresponding .pyc to .gitignore

* parser.py v 0.0.2-alpha

* added pyparsing to requirements.txt

* parser v 0.0.4-alpha

* Equation v 0.0.1-alpha

* added Equation to tra_analysis imports

* tests: New unit tests for submoduling (#66)

* feat: created kivy gui boilerplate

* migrated docker base image to debian

Signed-off-by: ltcptgeneral <learthurgo@gmail.com>

* migrated to ubuntu

Signed-off-by: ltcptgeneral <learthurgo@gmail.com>

* fixed issues

Signed-off-by: ltcptgeneral <learthurgo@gmail.com>

* fix: docker build?

* fix: use ubuntu bionic

* fix: get kivy installed

* @ltcptgeneral can't spell

* optim dockerfile for not installing unused packages

* install basic stuff while building the container

* use prebuilt image for development

* install pylint on base image

* rename and use new kivy

* tests: added tests for Array and CorrelationTest

Both are not working due to errors

* fix: Array no longer has *args and CorrelationTest functions no longer have self in the arguments

* use new thing

* use 20.04 base

* symlink pip3 to pip

* use pip instead of pip3

* tra_analysis v 2.1.0-alpha.2
SVM v 1.0.1
added unvalidated SVM unit tests

Signed-off-by: ltcptgeneral <learthurgo@gmail.com>

* fixed version number

Signed-off-by: ltcptgeneral <learthurgo@gmail.com>

* tests: added tests for ClassificationMetric

* partially fixed and commented out svm unit tests

* fixed some SVM unit tests

* added installing pytest to devcontainer.json

* fix: small fixes to KNN

Namely, removing self from parameters and passing correct arguments to KNeighborsClassifier constructor

* fix, test: Added tests for KNN and NaiveBayes.

Also made some small fixes in KNN, NaiveBayes, and RegressionMetric

* test: finished unit tests except for StatisticalTest

Also made various small fixes and style changes

* StatisticalTest v 1.0.1

* fixed RegressionMetric unit test
temporarily disabled CorrelationTest unit tests

* tra_analysis v 2.1.0-alpha.3

* readded __all__

* fix: floating point issues in unit tests for CorrelationTest

Co-authored-by: AGawde05 <agawde05@gmail.com>
Co-authored-by: ltcptgeneral <learthurgo@gmail.com>
Co-authored-by: Dev Singh <dev@devksingh.com>
Co-authored-by: jzpan1 <panzhenyu2014@gmail.com>

* fixed depreciated escape sequences

* ficed tests, indent, import in test_analysis

* changed version to 3.0.0
added backwards compatibility

* ficed pytest install in container

* removed GUI changes

Signed-off-by: Arthur Lu <learthurgo@gmail.com>

* incremented version to rc.1 (release candidate 1)

Signed-off-by: Arthur Lu <learthurgo@gmail.com>

* fixed NaiveBayes __changelog__

Signed-off-by: Arthur Lu <learthurgo@gmail.com>

* fix: __setitem__  == to single =

* Array v 1.0.1

* Revert "Array v 1.0.1"

This reverts commit 59783b79f7.

* Array v 1.0.1

* Array.py v 1.0.2
added more Array unit tests

* cleaned .gitignore
tra_analysis v 3.0.0-rc2

Signed-off-by: Arthur Lu <learthurgo@gmail.com>

* added *.pyc to gitignore
finished subdividing test_analysis

* feat: gui layout + basic func

* Froze and removed superscript (data-analysis)

* remove data-analysis deps install for devcontainer

* tukey pairwise comparison and multicomparison but no critical q-values

* quick patch for devcontainer.json

* better fix for devcontainer.json

* fixed some styling in StatisticalTest
removed print statement in StatisticalTest unit tests

* update analysis tests to be more effecient

* don't use loop for test_nativebayes

* removed useless secondary docker files

* tra-analysis v 3.0.0

Co-authored-by: James Pan <panzhenyu2014@gmail.com>
Co-authored-by: AGawde05 <agawde05@gmail.com>
Co-authored-by: zpan1 <72054510+zpan1@users.noreply.github.com>
Co-authored-by: Dev Singh <dev@devksingh.com>
Co-authored-by: = <=>
Co-authored-by: Dev Singh <dsingh@imsa.edu>
Co-authored-by: zpan1 <zpan@imsa.edu>
This commit is contained in:
Arthur Lu 2021-04-28 17:33:50 -07:00 committed by GitHub
parent f5a5da1ee1
commit 60beaa4563
51 changed files with 4122 additions and 2599 deletions

View File

@ -1,2 +1,7 @@
FROM python FROM ubuntu:20.04
WORKDIR ~/ WORKDIR /
RUN apt-get -y update
RUN DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends tzdata
RUN apt-get install -y python3 python3-dev git python3-pip python3-kivy python-is-python3 libgl1-mesa-dev build-essential
RUN ln -s $(which pip3) /usr/bin/pip
RUN pip install pymongo pandas numpy scipy scikit-learn matplotlib pylint kivy

View File

@ -0,0 +1,2 @@
FROM titanscout2022/tra-analysis-base:latest
WORKDIR /

View File

@ -1,7 +1,7 @@
{ {
"name": "TRA Analysis Development Environment", "name": "TRA Analysis Development Environment",
"build": { "build": {
"dockerfile": "Dockerfile", "dockerfile": "dev-dockerfile",
}, },
"settings": { "settings": {
"terminal.integrated.shell.linux": "/bin/bash", "terminal.integrated.shell.linux": "/bin/bash",
@ -24,5 +24,5 @@
"ms-python.python", "ms-python.python",
"waderyan.gitblame" "waderyan.gitblame"
], ],
"postCreateCommand": "apt install vim -y ; pip install -r data-analysis/requirements.txt ; pip install -r analysis-master/requirements.txt ; pip install pylint ; pip install tra-analysis" "postCreateCommand": "/usr/bin/pip3 install -r ${containerWorkspaceFolder}/analysis-master/requirements.txt && /usr/bin/pip3 install --no-cache-dir pylint && /usr/bin/pip3 install pytest"
} }

7
.github/PULL_REQUEST_TEMPLATE.md vendored Normal file
View File

@ -0,0 +1,7 @@
Fixes #
## Proposed Changes
-
-
-

View File

@ -1,38 +0,0 @@
# This workflow will install Python dependencies, run tests and lint with a variety of Python versions
# For more information see: https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions
name: Superscript Unit Tests
on:
push:
branches: [ master ]
pull_request:
branches: [ master ]
jobs:
build:
runs-on: ubuntu-latest
strategy:
matrix:
python-version: [3.7, 3.8]
env:
working-directory: ./data-analysis/
steps:
- uses: actions/checkout@v2
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v2
with:
python-version: ${{ matrix.python-version }}
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install pytest
if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
working-directory: ${{ env.working-directory }}
- name: Test with pytest
run: |
pytest
working-directory: ${{ env.working-directory }}

48
.gitignore vendored
View File

@ -1,41 +1,9 @@
benchmark_data.csv /.vscode/
data-analysis/keys/keytemp.json
data-analysis/__pycache__/analysis.cpython-37.pyc
apps/android/source/app/src/main/res/drawable-v24/uuh.png
apps/android/source/app/src/main/java/com/example/titanscouting/tits.java
data-analysis/analysis.cp37-win_amd64.pyd **/__pycache__/
data-analysis/analysis/analysis.c **/.pytest_cache/
data-analysis/analysis/analysis.cp37-win_amd64.pyd **/*.pyc
data-analysis/analysis/build/temp.win-amd64-3.7/Release/analysis.cp37-win_amd64.exp
data-analysis/analysis/build/temp.win-amd64-3.7/Release/analysis.cp37-win_amd64.lib **/build/
data-analysis/analysis/build/temp.win-amd64-3.7/Release/analysis.obj **/*.egg-info/
data-analysis/test.ipynb **/dist/
data-analysis/.ipynb_checkpoints/test-checkpoint.ipynb
.vscode/settings.json
.vscode
data-analysis/arthur_pull.ipynb
data-analysis/keys.txt
data-analysis/check_for_new_matches.ipynb
data-analysis/test.ipynb
data-analysis/visualize_pit.ipynb
data-analysis/config/keys.config
analysis-master/analysis/__pycache__/
analysis-master/analysis/metrics/__pycache__/
data-analysis/__pycache__/
analysis-master/analysis.egg-info/
analysis-master/build/
analysis-master/metrics/
data-analysis/config-pop.json
data-analysis/__pycache__/
analysis-master/__pycache__/
analysis-master/.pytest_cache/
data-analysis/.pytest_cache/
data-analysis/test.py
analysis-master/tra_analysis.egg-info
analysis-master/tra_analysis/__pycache__
analysis-master/tra_analysis/.ipynb_checkpoints
.pytest_cache
analysis-master/tra_analysis/metrics/__pycache__
analysis-master/dist
data-analysis/config/

View File

@ -42,7 +42,7 @@ tra-analysis operates like any other python package. Consult the [documentation]
Although any modern 64 bit platform should be supported, the following platforms have been tested to be working: Although any modern 64 bit platform should be supported, the following platforms have been tested to be working:
* AMD64 (Tested on Zen, Zen+, and Zen 2) * AMD64 (Tested on Zen, Zen+, and Zen 2)
* Intel 64/x86_64/x64 (Tested on Kaby Lake) * Intel 64/x86_64/x64 (Tested on Kaby Lake, Ice Lake)
* ARM64 (Tested on Broadcom BCM2836 SoC, Broadcom BCM2711 SoC) * ARM64 (Tested on Broadcom BCM2836 SoC, Broadcom BCM2711 SoC)
The following OSes have been tested to be working: The following OSes have been tested to be working:
@ -60,38 +60,7 @@ The following python versions are supported:
# `data-analysis` # `data-analysis`
To facilitate data analysis of collected scouting data in a user firendly tool, we created the data-analysis application. At its core it uses the tra-analysis package to conduct any number of user selected tests on data collected from the TRA scouting app. It uploads these tests back to MongoDB where it can be viewed from the app at any time. Data analysis has been separated into its own [repository](https://github.com/titanscouting/tra-data-analysis).
The data-analysis application also uses the TRA API to interface with MongoDB and uses the TBA API to collect additional data (match win/loss).
The application can be configured with a configuration tool or by editing the config.json directly.
## Prerequisites
---
Before installing and using data-analysis, make sure that you have installed the folowing prerequisites:
- A common operating system like **Windows** or (*most*) distributions of **Linux**. BSD may work but has not been tested nor is it reccomended.
- [Python](https://www.python.org/) version **3.6** or higher
- [Pip](https://pip.pypa.io/en/stable/) (installation instructions [here](https://pip.pypa.io/en/stable/installing/))
## Installing Requirements
---
Once navigated to the data-analysis folder run `pip install -r requirements.txt` to install all of the required python libraries.
## Scripts
---
The data-analysis application is a collection of various scripts and one config file. For users, only the main application `superscript.py` and the config file `config.json` are important.
To run the data-analysis application, navigate to the data-analysis folder once all requirements have been installed and run `python superscript.py`. If you encounter the error:
`pymongo.errors.ConfigurationError: Empty host (or extra comma in host list).`
don't worry, you may have just not configured the application correctly, but would otherwise work. Refer to [the documentation](https://titanscouting.github.io/analysis/data_analysis/Config) to learn how to configure data-analysis.
# Contributing # Contributing
@ -99,4 +68,3 @@ Read our included contributing guidelines (`CONTRIBUTING.md`) for more informati
# Build Statuses # Build Statuses
![Analysis Unit Tests](https://github.com/titanscout2022/red-alliance-analysis/workflows/Analysis%20Unit%20Tests/badge.svg) ![Analysis Unit Tests](https://github.com/titanscout2022/red-alliance-analysis/workflows/Analysis%20Unit%20Tests/badge.svg)
![Superscript Unit Tests](https://github.com/titanscout2022/red-alliance-analysis/workflows/Superscript%20Unit%20Tests/badge.svg?branch=master)

View File

@ -1,5 +0,0 @@
FROM python
WORKDIR ~/
COPY ./ ./
RUN pip install -r requirements.txt
CMD ["bash"]

View File

@ -1,3 +0,0 @@
cd ..
docker build -t tra-analysis-amd64-dev -f docker/Dockerfile .
docker run -it tra-analysis-amd64-dev

View File

@ -1,6 +1,6 @@
numba
numpy numpy
scipy scipy
scikit-learn scikit-learn
six six
matplotlib matplotlib
pyparsing

View File

@ -1,4 +1,5 @@
import setuptools import setuptools
import tra_analysis
requirements = [] requirements = []
@ -8,11 +9,11 @@ with open("requirements.txt", 'r') as file:
setuptools.setup( setuptools.setup(
name="tra_analysis", name="tra_analysis",
version="2.1.0", version=tra_analysis.__version__,
author="The Titan Scouting Team", author="The Titan Scouting Team",
author_email="titanscout2022@gmail.com", author_email="titanscout2022@gmail.com",
description="Analysis package developed by Titan Scouting for The Red Alliance", description="Analysis package developed by Titan Scouting for The Red Alliance",
long_description="", long_description="../README.md",
long_description_content_type="text/markdown", long_description_content_type="text/markdown",
url="https://github.com/titanscout2022/tr2022-strategy", url="https://github.com/titanscout2022/tr2022-strategy",
packages=setuptools.find_packages(), packages=setuptools.find_packages(),
@ -21,6 +22,8 @@ setuptools.setup(
classifiers=[ classifiers=[
"Programming Language :: Python :: 3", "Programming Language :: Python :: 3",
"Operating System :: OS Independent", "Operating System :: OS Independent",
"Topic :: Data Analysis"
], ],
python_requires='>=3.6', python_requires='>=3.6',
keywords="data analysis tools"
) )

View File

@ -1,35 +1,233 @@
from tra_analysis import analysis as an import numpy as np
from tra_analysis import metrics import sklearn
from tra_analysis import fits from sklearn import metrics
from tra_analysis import Analysis as an
from tra_analysis import Array
from tra_analysis import ClassificationMetric
from tra_analysis import CorrelationTest
from tra_analysis import Fit
from tra_analysis import KNN
from tra_analysis import NaiveBayes
from tra_analysis import RandomForest
from tra_analysis import RegressionMetric
from tra_analysis import Sort
from tra_analysis import StatisticalTest
from tra_analysis import SVM
from tra_analysis.equation.parser import BNF
test_data_linear = [1, 3, 6, 7, 9]
test_data_linear2 = [2, 2, 5, 7, 13]
test_data_linear3 = [2, 5, 8, 6, 14]
test_data_array = Array(test_data_linear)
x_data_circular = []
y_data_circular = []
y_data_ccu = [1, 3, 7, 14, 21]
y_data_ccd = [1, 5, 7, 8.5, 8.66]
test_data_scrambled = [-32, 34, 19, 72, -65, -11, -43, 6, 85, -17, -98, -26, 12, 20, 9, -92, -40, 98, -78, 17, -20, 49, 93, -27, -24, -66, 40, 84, 1, -64, -68, -25, -42, -46, -76, 43, -3, 30, -14, -34, -55, -13, 41, -30, 0, -61, 48, 23, 60, 87, 80, 77, 53, 73, 79, 24, -52, 82, 8, -44, 65, 47, -77, 94, 7, 37, -79, 36, -94, 91, 59, 10, 97, -38, -67, 83, 54, 31, -95, -63, 16, -45, 21, -12, 66, -48, -18, -96, -90, -21, -83, -74, 39, 64, 69, -97, 13, 55, 27, -39]
test_data_sorted = [-98, -97, -96, -95, -94, -92, -90, -83, -79, -78, -77, -76, -74, -68, -67, -66, -65, -64, -63, -61, -55, -52, -48, -46, -45, -44, -43, -42, -40, -39, -38, -34, -32, -30, -27, -26, -25, -24, -21, -20, -18, -17, -14, -13, -12, -11, -3, 0, 1, 6, 7, 8, 9, 10, 12, 13, 16, 17, 19, 20, 21, 23, 24, 27, 30, 31, 34, 36, 37, 39, 40, 41, 43, 47, 48, 49, 53, 54, 55, 59, 60, 64, 65, 66, 69, 72, 73, 77, 79, 80, 82, 83, 84, 85, 87, 91, 93, 94, 97, 98]
test_data_2D_pairs = np.array([[-1, -1], [-2, -1], [1, 1], [2, 1]])
test_data_2D_positive = np.array([[23, 51], [21, 32], [15, 25], [17, 31]])
test_output = np.array([1, 3, 4, 5])
test_labels_2D_pairs = np.array([1, 1, 2, 2])
validation_data_2D_pairs = np.array([[-0.8, -1], [0.8, 1.2]])
validation_labels_2D_pairs = np.array([1, 2])
def test_basicstats():
def test_():
test_data_linear = [1, 3, 6, 7, 9]
x_data_circular = []
y_data_circular = []
y_data_ccu = [1, 3, 7, 14, 21]
y_data_ccd = [1, 5, 7, 8.5, 8.66]
test_data_scrambled = [-32, 34, 19, 72, -65, -11, -43, 6, 85, -17, -98, -26, 12, 20, 9, -92, -40, 98, -78, 17, -20, 49, 93, -27, -24, -66, 40, 84, 1, -64, -68, -25, -42, -46, -76, 43, -3, 30, -14, -34, -55, -13, 41, -30, 0, -61, 48, 23, 60, 87, 80, 77, 53, 73, 79, 24, -52, 82, 8, -44, 65, 47, -77, 94, 7, 37, -79, 36, -94, 91, 59, 10, 97, -38, -67, 83, 54, 31, -95, -63, 16, -45, 21, -12, 66, -48, -18, -96, -90, -21, -83, -74, 39, 64, 69, -97, 13, 55, 27, -39]
test_data_sorted = [-98, -97, -96, -95, -94, -92, -90, -83, -79, -78, -77, -76, -74, -68, -67, -66, -65, -64, -63, -61, -55, -52, -48, -46, -45, -44, -43, -42, -40, -39, -38, -34, -32, -30, -27, -26, -25, -24, -21, -20, -18, -17, -14, -13, -12, -11, -3, 0, 1, 6, 7, 8, 9, 10, 12, 13, 16, 17, 19, 20, 21, 23, 24, 27, 30, 31, 34, 36, 37, 39, 40, 41, 43, 47, 48, 49, 53, 54, 55, 59, 60, 64, 65, 66, 69, 72, 73, 77, 79, 80, 82, 83, 84, 85, 87, 91, 93, 94, 97, 98]
assert an.basic_stats(test_data_linear) == {"mean": 5.2, "median": 6.0, "standard-deviation": 2.85657137141714, "variance": 8.16, "minimum": 1.0, "maximum": 9.0} assert an.basic_stats(test_data_linear) == {"mean": 5.2, "median": 6.0, "standard-deviation": 2.85657137141714, "variance": 8.16, "minimum": 1.0, "maximum": 9.0}
assert an.z_score(3.2, 6, 1.5) == -1.8666666666666665 assert an.z_score(3.2, 6, 1.5) == -1.8666666666666665
assert an.z_normalize([test_data_linear], 1).tolist() == [[0.07537783614444091, 0.22613350843332272, 0.45226701686664544, 0.5276448530110863, 0.6784005252999682]] assert an.z_normalize([test_data_linear], 1).tolist() == [[0.07537783614444091, 0.22613350843332272, 0.45226701686664544, 0.5276448530110863, 0.6784005252999682]]
def test_regression():
assert all(isinstance(item, str) for item in an.regression(test_data_linear, y_data_ccu, ["lin"])) == True assert all(isinstance(item, str) for item in an.regression(test_data_linear, y_data_ccu, ["lin"])) == True
#assert all(isinstance(item, str) for item in an.regression(test_data_linear, y_data_ccd, ["log"])) == True #assert all(isinstance(item, str) for item in an.regression(test_data_linear, y_data_ccd, ["log"])) == True
#assert all(isinstance(item, str) for item in an.regression(test_data_linear, y_data_ccu, ["exp"])) == True #assert all(isinstance(item, str) for item in an.regression(test_data_linear, y_data_ccu, ["exp"])) == True
#assert all(isinstance(item, str) for item in an.regression(test_data_linear, y_data_ccu, ["ply"])) == True #assert all(isinstance(item, str) for item in an.regression(test_data_linear, y_data_ccu, ["ply"])) == True
#assert all(isinstance(item, str) for item in an.regression(test_data_linear, y_data_ccd, ["sig"])) == True #assert all(isinstance(item, str) for item in an.regression(test_data_linear, y_data_ccd, ["sig"])) == True
def test_metrics():
assert an.Metric().elo(1500, 1500, [1, 0], 400, 24) == 1512.0 assert an.Metric().elo(1500, 1500, [1, 0], 400, 24) == 1512.0
assert an.Metric().glicko2(1500, 250, 0.06, [1500, 1400], [250, 240], [1, 0]) == (1478.864307445517, 195.99122679202452, 0.05999602937563585) assert an.Metric().glicko2(1500, 250, 0.06, [1500, 1400], [250, 240], [1, 0]) == (1478.864307445517, 195.99122679202452, 0.05999602937563585)
#assert an.Metric().trueskill([[(25, 8.33), (24, 8.25), (32, 7.5)], [(25, 8.33), (25, 8.33), (21, 6.5)]], [1, 0]) == [(metrics.trueskill.Rating(mu=21.346, sigma=7.875), metrics.trueskill.Rating(mu=20.415, sigma=7.808), metrics.trueskill.Rating(mu=29.037, sigma=7.170)), (metrics.trueskill.Rating(mu=28.654, sigma=7.875), metrics.trueskill.Rating(mu=28.654, sigma=7.875), metrics.trueskill.Rating(mu=23.225, sigma=6.287))] #assert an.Metric().trueskill([[(25, 8.33), (24, 8.25), (32, 7.5)], [(25, 8.33), (25, 8.33), (21, 6.5)]], [1, 0]) == [(metrics.trueskill.Rating(mu=21.346, sigma=7.875), metrics.trueskill.Rating(mu=20.415, sigma=7.808), metrics.trueskill.Rating(mu=29.037, sigma=7.170)), (metrics.trueskill.Rating(mu=28.654, sigma=7.875), metrics.trueskill.Rating(mu=28.654, sigma=7.875), metrics.trueskill.Rating(mu=23.225, sigma=6.287))]
assert all(a == b for a, b in zip(an.Sort().quicksort(test_data_scrambled), test_data_sorted))
assert all(a == b for a, b in zip(an.Sort().mergesort(test_data_scrambled), test_data_sorted)) def test_array():
assert all(a == b for a, b in zip(an.Sort().introsort(test_data_scrambled), test_data_sorted))
assert all(a == b for a, b in zip(an.Sort().heapsort(test_data_scrambled), test_data_sorted)) assert test_data_array.elementwise_mean() == 5.2
assert all(a == b for a, b in zip(an.Sort().insertionsort(test_data_scrambled), test_data_sorted)) assert test_data_array.elementwise_median() == 6.0
assert all(a == b for a, b in zip(an.Sort().timsort(test_data_scrambled), test_data_sorted)) assert test_data_array.elementwise_stdev() == 2.85657137141714
assert all(a == b for a, b in zip(an.Sort().selectionsort(test_data_scrambled), test_data_sorted)) assert test_data_array.elementwise_variance() == 8.16
assert all(a == b for a, b in zip(an.Sort().shellsort(test_data_scrambled), test_data_sorted)) assert test_data_array.elementwise_npmin() == 1
assert all(a == b for a, b in zip(an.Sort().bubblesort(test_data_scrambled), test_data_sorted)) assert test_data_array.elementwise_npmax() == 9
assert all(a == b for a, b in zip(an.Sort().cyclesort(test_data_scrambled), test_data_sorted)) assert test_data_array.elementwise_stats() == (5.2, 6.0, 2.85657137141714, 8.16, 1, 9)
assert all(a == b for a, b in zip(an.Sort().cocktailsort(test_data_scrambled), test_data_sorted))
assert fits.CircleFit(x=[0,0,-1,1], y=[1, -1, 0, 0]).LSC() == (0.0, 0.0, 1.0, 0.0) for i in range(len(test_data_array)):
assert test_data_array[i] == test_data_linear[i]
test_data_array[0] = 100
expected = [100, 3, 6, 7, 9]
for i in range(len(test_data_array)):
assert test_data_array[i] == expected[i]
def test_classifmetric():
classif_metric = ClassificationMetric(test_data_linear2, test_data_linear)
assert classif_metric[0].all() == metrics.confusion_matrix(test_data_linear, test_data_linear2).all()
assert classif_metric[1] == metrics.classification_report(test_data_linear, test_data_linear2)
def test_correlationtest():
assert all(np.isclose(list(CorrelationTest.anova_oneway(test_data_linear, test_data_linear2).values()), [0.05825242718446602, 0.8153507906592907], rtol=1e-10))
assert all(np.isclose(list(CorrelationTest.pearson(test_data_linear, test_data_linear2).values()), [0.9153061540753287, 0.02920895440940868], rtol=1e-10))
assert all(np.isclose(list(CorrelationTest.spearman(test_data_linear, test_data_linear2).values()), [0.9746794344808964, 0.004818230468198537], rtol=1e-10))
assert all(np.isclose(list(CorrelationTest.point_biserial(test_data_linear, test_data_linear2).values()), [0.9153061540753287, 0.02920895440940868], rtol=1e-10))
assert all(np.isclose(list(CorrelationTest.kendall(test_data_linear, test_data_linear2).values()), [0.9486832980505137, 0.022977401503206086], rtol=1e-10))
assert all(np.isclose(list(CorrelationTest.kendall_weighted(test_data_linear, test_data_linear2).values()), [0.9750538072369643, np.nan], rtol=1e-10, equal_nan=True))
def test_fit():
assert Fit.CircleFit(x=[0,0,-1,1], y=[1, -1, 0, 0]).LSC() == (0.0, 0.0, 1.0, 0.0)
def test_knn():
model, metric = KNN.knn_classifier(test_data_2D_pairs, test_labels_2D_pairs, 2)
assert isinstance(model, sklearn.neighbors.KNeighborsClassifier)
assert np.array([[0,0], [2,0]]).all() == metric[0].all()
assert ' precision recall f1-score support\n\n 1 0.00 0.00 0.00 0.0\n 2 0.00 0.00 0.00 2.0\n\n accuracy 0.00 2.0\n macro avg 0.00 0.00 0.00 2.0\nweighted avg 0.00 0.00 0.00 2.0\n' == metric[1]
model, metric = KNN.knn_regressor(test_data_2D_pairs, test_output, 2)
assert isinstance(model, sklearn.neighbors.KNeighborsRegressor)
assert (-25.0, 6.5, 2.5495097567963922) == metric
def test_naivebayes():
model, metric = NaiveBayes.gaussian(test_data_2D_pairs, test_labels_2D_pairs)
assert isinstance(model, sklearn.naive_bayes.GaussianNB)
assert metric[0].all() == np.array([[0, 0], [2, 0]]).all()
model, metric = NaiveBayes.multinomial(test_data_2D_positive, test_labels_2D_pairs)
assert isinstance(model, sklearn.naive_bayes.MultinomialNB)
assert metric[0].all() == np.array([[0, 0], [2, 0]]).all()
model, metric = NaiveBayes.bernoulli(test_data_2D_pairs, test_labels_2D_pairs)
assert isinstance(model, sklearn.naive_bayes.BernoulliNB)
assert metric[0].all() == np.array([[0, 0], [2, 0]]).all()
model, metric = NaiveBayes.complement(test_data_2D_positive, test_labels_2D_pairs)
assert isinstance(model, sklearn.naive_bayes.ComplementNB)
assert metric[0].all() == np.array([[0, 0], [2, 0]]).all()
def test_randomforest():
model, metric = RandomForest.random_forest_classifier(test_data_2D_pairs, test_labels_2D_pairs, 0.3, 2)
assert isinstance(model, sklearn.ensemble.RandomForestClassifier)
assert metric[0].all() == np.array([[0, 0], [2, 0]]).all()
model, metric = RandomForest.random_forest_regressor(test_data_2D_pairs, test_labels_2D_pairs, 0.3, 2)
assert isinstance(model, sklearn.ensemble.RandomForestRegressor)
assert metric == (0.0, 1.0, 1.0)
def test_regressionmetric():
assert RegressionMetric(test_data_linear, test_data_linear2)== (0.7705314009661837, 3.8, 1.9493588689617927)
def test_sort():
sorts = [Sort.quicksort, Sort.mergesort, Sort.heapsort, Sort.introsort, Sort.insertionsort, Sort.timsort, Sort.selectionsort, Sort.shellsort, Sort.bubblesort, Sort.cyclesort, Sort.cocktailsort]
for sort in sorts:
assert all(a == b for a, b in zip(sort(test_data_scrambled), test_data_sorted))
def test_statisticaltest():
#print(StatisticalTest.tukey_multicomparison([test_data_linear, test_data_linear2, test_data_linear3]))
assert StatisticalTest.tukey_multicomparison([test_data_linear, test_data_linear2, test_data_linear3]) == \
{'group 1 and group 2': [0.32571517201527916, False], 'group 1 and group 3': [0.977145516045838, False], 'group 2 and group 3': [0.6514303440305589, False]}
#assert all(np.isclose([i[0] for i in list(StatisticalTest.tukey_multicomparison([test_data_linear, test_data_linear2, test_data_linear3]).values],
# [0.32571517201527916, 0.977145516045838, 0.6514303440305589]))
#assert [i[1] for i in StatisticalTest.tukey_multicomparison([test_data_linear, test_data_linear2, test_data_linear3]).values] == \
# [False, False, False]
def test_svm():
data = test_data_2D_pairs
labels = test_labels_2D_pairs
test_data = validation_data_2D_pairs
test_labels = validation_labels_2D_pairs
lin_kernel = SVM.PrebuiltKernel.Linear()
#ply_kernel = SVM.PrebuiltKernel.Polynomial(3, 0)
rbf_kernel = SVM.PrebuiltKernel.RBF('scale')
sig_kernel = SVM.PrebuiltKernel.Sigmoid(0)
lin_kernel = SVM.fit(lin_kernel, data, labels)
#ply_kernel = SVM.fit(ply_kernel, data, labels)
rbf_kernel = SVM.fit(rbf_kernel, data, labels)
sig_kernel = SVM.fit(sig_kernel, data, labels)
for i in range(len(test_data)):
assert lin_kernel.predict([test_data[i]]).tolist() == [test_labels[i]]
#for i in range(len(test_data)):
# assert ply_kernel.predict([test_data[i]]).tolist() == [test_labels[i]]
for i in range(len(test_data)):
assert rbf_kernel.predict([test_data[i]]).tolist() == [test_labels[i]]
for i in range(len(test_data)):
assert sig_kernel.predict([test_data[i]]).tolist() == [test_labels[i]]
def test_equation():
parser = BNF()
correctParse = {
"9": 9.0,
"-9": -9.0,
"--9": 9.0,
"-E": -2.718281828459045,
"9 + 3 + 6": 18.0,
"9 + 3 / 11": 9.272727272727273,
"(9 + 3)": 12.0,
"(9+3) / 11": 1.0909090909090908,
"9 - 12 - 6": -9.0,
"9 - (12 - 6)": 3.0,
"2*3.14159": 6.28318,
"3.1415926535*3.1415926535 / 10": 0.9869604400525172,
"PI * PI / 10": 0.9869604401089358,
"PI*PI/10": 0.9869604401089358,
"PI^2": 9.869604401089358,
"round(PI^2)": 10,
"6.02E23 * 8.048": 4.844896e+24,
"e / 3": 0.9060939428196817,
"sin(PI/2)": 1.0,
"10+sin(PI/4)^2": 10.5,
"trunc(E)": 2,
"trunc(-E)": -2,
"round(E)": 3,
"round(-E)": -3,
"E^PI": 23.140692632779263,
"exp(0)": 1.0,
"exp(1)": 2.718281828459045,
"2^3^2": 512.0,
"(2^3)^2": 64.0,
"2^3+2": 10.0,
"2^3+5": 13.0,
"2^9": 512.0,
"sgn(-2)": -1,
"sgn(0)": 0,
"sgn(0.1)": 1,
"sgn(cos(PI/4))": 1,
"sgn(cos(PI/2))": 0,
"sgn(cos(PI*3/4))": -1,
"+(sgn(cos(PI/4)))": 1,
"-(sgn(cos(PI/4)))": -1,
}
for key in list(correctParse.keys()):
assert parser.eval(key) == correctParse[key]

View File

@ -1,35 +0,0 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"string = \"3+4+5\"\n",
"re.sub(\"\\d+[+]{1}\\d+\", string, sum([int(i) for i in re.split(\"[+]{1}\", re.search(\"\\d+[+]{1}\\d+\", string).group())]))"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 4
}

View File

@ -0,0 +1,628 @@
# Titan Robotics Team 2022: Analysis Module
# Written by Arthur Lu
# Notes:
# this should be imported as a python module using 'from tra_analysis import Analysis'
# this should be included in the local directory or environment variable
# this module has been optimized for multhreaded computing
# current benchmark of optimization: 1.33 times faster
# setup:
__version__ = "3.0.2"
# changelog should be viewed using print(analysis.__changelog__)
__changelog__ = """changelog:
3.0.2:
- fixed __all__
3.0.1:
- removed numba dependency and calls
3.0.0:
- exported several submodules to their own files while preserving backwards compatibility:
- Array
- ClassificationMetric
- CorrelationTest
- KNN
- NaiveBayes
- RandomForest
- RegressionMetric
- Sort
- StatisticalTest
- SVM
- note: above listed submodules will not be supported in the future
- future changes to all submodules will be held in their respective changelogs
- future changes altering the parent package will be held in the __changelog__ of the parent package (in __init__.py)
- changed reference to module name to Analysis
2.3.1:
- fixed bugs in Array class
2.3.0:
- overhauled Array class
2.2.3:
- fixed spelling of RandomForest
- made n_neighbors required for KNN
- made n_classifiers required for SVM
2.2.2:
- fixed 2.2.1 changelog entry
- changed regression to return dictionary
2.2.1:
- changed all references to parent package analysis to tra_analysis
2.2.0:
- added Sort class
- added several array sorting functions to Sort class including:
- quick sort
- merge sort
- intro(spective) sort
- heap sort
- insertion sort
- tim sort
- selection sort
- bubble sort
- cycle sort
- cocktail sort
- tested all sorting algorithms with both lists and numpy arrays
- depreciated sort function from Array class
- added warnings as an import
2.1.4:
- added sort and search functions to Array class
2.1.3:
- changed output of basic_stats and histo_analysis to libraries
- fixed __all__
2.1.2:
- renamed ArrayTest class to Array
2.1.1:
- added add, mul, neg, and inv functions to ArrayTest class
- added normalize function to ArrayTest class
- added dot and cross functions to ArrayTest class
2.1.0:
- added ArrayTest class
- added elementwise mean, median, standard deviation, variance, min, max functions to ArrayTest class
- added elementwise_stats to ArrayTest which encapsulates elementwise statistics
- appended to __all__ to reflect changes
2.0.6:
- renamed func functions in regression to lin, log, exp, and sig
2.0.5:
- moved random_forrest_regressor and random_forrest_classifier to RandomForrest class
- renamed Metrics to Metric
- renamed RegressionMetrics to RegressionMetric
- renamed ClassificationMetrics to ClassificationMetric
- renamed CorrelationTests to CorrelationTest
- renamed StatisticalTests to StatisticalTest
- reflected rafactoring to all mentions of above classes/functions
2.0.4:
- fixed __all__ to reflected the correct functions and classes
- fixed CorrelationTests and StatisticalTests class functions to require self invocation
- added missing math import
- fixed KNN class functions to require self invocation
- fixed Metrics class functions to require self invocation
- various spelling fixes in CorrelationTests and StatisticalTests
2.0.3:
- bug fixes with CorrelationTests and StatisticalTests
- moved glicko2 and trueskill to the metrics subpackage
- moved elo to a new metrics subpackage
2.0.2:
- fixed docs
2.0.1:
- fixed docs
2.0.0:
- cleaned up wild card imports with scipy and sklearn
- added CorrelationTests class
- added StatisticalTests class
- added several correlation tests to CorrelationTests
- added several statistical tests to StatisticalTests
1.13.9:
- moved elo, glicko2, trueskill functions under class Metrics
1.13.8:
- moved Glicko2 to a seperate package
1.13.7:
- fixed bug with trueskill
1.13.6:
- cleaned up imports
1.13.5:
- cleaned up package
1.13.4:
- small fixes to regression to improve performance
1.13.3:
- filtered nans from regression
1.13.2:
- removed torch requirement, and moved Regression back to regression.py
1.13.1:
- bug fix with linear regression not returning a proper value
- cleaned up regression
- fixed bug with polynomial regressions
1.13.0:
- fixed all regressions to now properly work
1.12.6:
- fixed bg with a division by zero in histo_analysis
1.12.5:
- fixed numba issues by removing numba from elo, glicko2 and trueskill
1.12.4:
- renamed gliko to glicko
1.12.3:
- removed depreciated code
1.12.2:
- removed team first time trueskill instantiation in favor of integration in superscript.py
1.12.1:
- improved readibility of regression outputs by stripping tensor data
- used map with lambda to acheive the improved readibility
- lost numba jit support with regression, and generated_jit hangs at execution
- TODO: reimplement correct numba integration in regression
1.12.0:
- temporarily fixed polynomial regressions by using sklearn's PolynomialFeatures
1.11.010:
- alphabeticaly ordered import lists
1.11.9:
- bug fixes
1.11.8:
- bug fixes
1.11.7:
- bug fixes
1.11.6:
- tested min and max
- bug fixes
1.11.5:
- added min and max in basic_stats
1.11.4:
- bug fixes
1.11.3:
- bug fixes
1.11.2:
- consolidated metrics
- fixed __all__
1.11.1:
- added test/train split to RandomForestClassifier and RandomForestRegressor
1.11.0:
- added RandomForestClassifier and RandomForestRegressor
- note: untested
1.10.0:
- added numba.jit to remaining functions
1.9.2:
- kernelized PCA and KNN
1.9.1:
- fixed bugs with SVM and NaiveBayes
1.9.0:
- added SVM class, subclasses, and functions
- note: untested
1.8.0:
- added NaiveBayes classification engine
- note: untested
1.7.0:
- added knn()
- added confusion matrix to decisiontree()
1.6.2:
- changed layout of __changelog to be vscode friendly
1.6.1:
- added additional hyperparameters to decisiontree()
1.6.0:
- fixed __version__
- fixed __all__ order
- added decisiontree()
1.5.3:
- added pca
1.5.2:
- reduced import list
- added kmeans clustering engine
1.5.1:
- simplified regression by using .to(device)
1.5.0:
- added polynomial regression to regression(); untested
1.4.0:
- added trueskill()
1.3.2:
- renamed regression class to Regression, regression_engine() to regression gliko2_engine class to Gliko2
1.3.1:
- changed glicko2() to return tuple instead of array
1.3.0:
- added glicko2_engine class and glicko()
- verified glicko2() accuracy
1.2.3:
- fixed elo()
1.2.2:
- added elo()
- elo() has bugs to be fixed
1.2.1:
- readded regrression import
1.2.0:
- integrated regression.py as regression class
- removed regression import
- fixed metadata for regression class
- fixed metadata for analysis class
1.1.1:
- regression_engine() bug fixes, now actaully regresses
1.1.0:
- added regression_engine()
- added all regressions except polynomial
1.0.7:
- updated _init_device()
1.0.6:
- removed useless try statements
1.0.5:
- removed impossible outcomes
1.0.4:
- added performance metrics (r^2, mse, rms)
1.0.3:
- resolved nopython mode for mean, median, stdev, variance
1.0.2:
- snapped (removed) majority of uneeded imports
- forced object mode (bad) on all jit
- TODO: stop numba complaining about not being able to compile in nopython mode
1.0.1:
- removed from sklearn import * to resolve uneeded wildcard imports
1.0.0:
- removed c_entities,nc_entities,obstacles,objectives from __all__
- applied numba.jit to all functions
- depreciated and removed stdev_z_split
- cleaned up histo_analysis to include numpy and numba.jit optimizations
- depreciated and removed all regression functions in favor of future pytorch optimizer
- depreciated and removed all nonessential functions (basic_analysis, benchmark, strip_data)
- optimized z_normalize using sklearn.preprocessing.normalize
- TODO: implement kernel/function based pytorch regression optimizer
0.9.0:
- refactored
- numpyed everything
- removed stats in favor of numpy functions
0.8.5:
- minor fixes
0.8.4:
- removed a few unused dependencies
0.8.3:
- added p_value function
0.8.2:
- updated __all__ correctly to contain changes made in v 0.8.0 and v 0.8.1
0.8.1:
- refactors
- bugfixes
0.8.0:
- depreciated histo_analysis_old
- depreciated debug
- altered basic_analysis to take array data instead of filepath
- refactor
- optimization
0.7.2:
- bug fixes
0.7.1:
- bug fixes
0.7.0:
- added tanh_regression (logistical regression)
- bug fixes
0.6.5:
- added z_normalize function to normalize dataset
- bug fixes
0.6.4:
- bug fixes
0.6.3:
- bug fixes
0.6.2:
- bug fixes
0.6.1:
- corrected __all__ to contain all of the functions
0.6.0:
- added calc_overfit, which calculates two measures of overfit, error and performance
- added calculating overfit to optimize_regression
0.5.0:
- added optimize_regression function, which is a sample function to find the optimal regressions
- optimize_regression function filters out some overfit funtions (functions with r^2 = 1)
- planned addition: overfit detection in the optimize_regression function
0.4.2:
- added __changelog__
- updated debug function with log and exponential regressions
0.4.1:
- added log regressions
- added exponential regressions
- added log_regression and exp_regression to __all__
0.3.8:
- added debug function to further consolidate functions
0.3.7:
- added builtin benchmark function
- added builtin random (linear) data generation function
- added device initialization (_init_device)
0.3.6:
- reorganized the imports list to be in alphabetical order
- added search and regurgitate functions to c_entities, nc_entities, obstacles, objectives
0.3.5:
- major bug fixes
- updated historical analysis
- depreciated old historical analysis
0.3.4:
- added __version__, __author__, __all__
- added polynomial regression
- added root mean squared function
- added r squared function
0.3.3:
- bug fixes
- added c_entities
0.3.2:
- bug fixes
- added nc_entities, obstacles, objectives
- consolidated statistics.py to analysis.py
0.3.1:
- compiled 1d, column, and row basic stats into basic stats function
0.3.0:
- added historical analysis function
0.2.x:
- added z score test
0.1.x:
- major bug fixes
0.0.x:
- added loading csv
- added 1d, column, row basic stats
"""
__author__ = (
"Arthur Lu <learthurgo@gmail.com>",
)
__all__ = [
'load_csv',
'basic_stats',
'z_score',
'z_normalize',
'histo_analysis',
'regression',
'Metric',
'kmeans',
'pca',
'decisiontree',
# all statistics functions left out due to integration in other functions
]
# now back to your regularly scheduled programming:
# imports (now in alphabetical order! v 0.3.006):
import csv
from tra_analysis.metrics import elo as Elo
from tra_analysis.metrics import glicko2 as Glicko2
import math
import numpy as np
import scipy
from scipy import optimize, stats
import sklearn
from sklearn import preprocessing, pipeline, linear_model, metrics, cluster, decomposition, tree, neighbors, naive_bayes, svm, model_selection, ensemble
from tra_analysis.metrics import trueskill as Trueskill
import warnings
# import submodules
from .Array import Array
from .ClassificationMetric import ClassificationMetric
from .CorrelationTest_obj import CorrelationTest
from .KNN_obj import KNN
from .NaiveBayes_obj import NaiveBayes
from .RandomForest_obj import RandomForest
from .RegressionMetric import RegressionMetric
from .Sort_obj import Sort
from .StatisticalTest_obj import StatisticalTest
from . import SVM
class error(ValueError):
pass
def load_csv(filepath):
with open(filepath, newline='') as csvfile:
file_array = np.array(list(csv.reader(csvfile)))
csvfile.close()
return file_array
# expects 1d array
def basic_stats(data):
data_t = np.array(data).astype(float)
_mean = mean(data_t)
_median = median(data_t)
_stdev = stdev(data_t)
_variance = variance(data_t)
_min = npmin(data_t)
_max = npmax(data_t)
return {"mean": _mean, "median": _median, "standard-deviation": _stdev, "variance": _variance, "minimum": _min, "maximum": _max}
# returns z score with inputs of point, mean and standard deviation of spread
def z_score(point, mean, stdev):
score = (point - mean) / stdev
return score
# expects 2d array, normalizes across all axes
def z_normalize(array, *args):
array = np.array(array)
for arg in args:
array = sklearn.preprocessing.normalize(array, axis = arg)
return array
# expects 2d array of [x,y]
def histo_analysis(hist_data):
if len(hist_data[0]) > 2:
hist_data = np.array(hist_data)
derivative = np.array(len(hist_data) - 1, dtype = float)
t = np.diff(hist_data)
derivative = t[1] / t[0]
np.sort(derivative)
return {"mean": basic_stats(derivative)["mean"], "deviation": basic_stats(derivative)["standard-deviation"]}
else:
return None
def regression(inputs, outputs, args): # inputs, outputs expects N-D array
X = np.array(inputs)
y = np.array(outputs)
regressions = {}
if 'lin' in args: # formula: ax + b
try:
def lin(x, a, b):
return a * x + b
popt, pcov = scipy.optimize.curve_fit(lin, X, y)
coeffs = popt.flatten().tolist()
regressions["lin"] = (str(coeffs[0]) + "*x+" + str(coeffs[1]))
except Exception as e:
pass
if 'log' in args: # formula: a log (b(x + c)) + d
try:
def log(x, a, b, c, d):
return a * np.log(b*(x + c)) + d
popt, pcov = scipy.optimize.curve_fit(log, X, y)
coeffs = popt.flatten().tolist()
regressions["log"] = (str(coeffs[0]) + "*log(" + str(coeffs[1]) + "*(x+" + str(coeffs[2]) + "))+" + str(coeffs[3]))
except Exception as e:
pass
if 'exp' in args: # formula: a e ^ (b(x + c)) + d
try:
def exp(x, a, b, c, d):
return a * np.exp(b*(x + c)) + d
popt, pcov = scipy.optimize.curve_fit(exp, X, y)
coeffs = popt.flatten().tolist()
regressions["exp"] = (str(coeffs[0]) + "*e^(" + str(coeffs[1]) + "*(x+" + str(coeffs[2]) + "))+" + str(coeffs[3]))
except Exception as e:
pass
if 'ply' in args: # formula: a + bx^1 + cx^2 + dx^3 + ...
inputs = np.array([inputs])
outputs = np.array([outputs])
plys = {}
limit = len(outputs[0])
for i in range(2, limit):
model = sklearn.preprocessing.PolynomialFeatures(degree = i)
model = sklearn.pipeline.make_pipeline(model, sklearn.linear_model.LinearRegression())
model = model.fit(np.rot90(inputs), np.rot90(outputs))
params = model.steps[1][1].intercept_.tolist()
params = np.append(params, model.steps[1][1].coef_[0].tolist()[1::])
params = params.flatten().tolist()
temp = ""
counter = 0
for param in params:
temp += "(" + str(param) + "*x^" + str(counter) + ")"
counter += 1
plys["x^" + str(i)] = (temp)
regressions["ply"] = (plys)
if 'sig' in args: # formula: a tanh (b(x + c)) + d
try:
def sig(x, a, b, c, d):
return a * np.tanh(b*(x + c)) + d
popt, pcov = scipy.optimize.curve_fit(sig, X, y)
coeffs = popt.flatten().tolist()
regressions["sig"] = (str(coeffs[0]) + "*tanh(" + str(coeffs[1]) + "*(x+" + str(coeffs[2]) + "))+" + str(coeffs[3]))
except Exception as e:
pass
return regressions
class Metric:
def elo(self, starting_score, opposing_score, observed, N, K):
return Elo.calculate(starting_score, opposing_score, observed, N, K)
def glicko2(self, starting_score, starting_rd, starting_vol, opposing_score, opposing_rd, observations):
player = Glicko2.Glicko2(rating = starting_score, rd = starting_rd, vol = starting_vol)
player.update_player([x for x in opposing_score], [x for x in opposing_rd], observations)
return (player.rating, player.rd, player.vol)
def trueskill(self, teams_data, observations): # teams_data is array of array of tuples ie. [[(mu, sigma), (mu, sigma), (mu, sigma)], [(mu, sigma), (mu, sigma), (mu, sigma)]]
team_ratings = []
for team in teams_data:
team_temp = ()
for player in team:
player = Trueskill.Rating(player[0], player[1])
team_temp = team_temp + (player,)
team_ratings.append(team_temp)
return Trueskill.rate(team_ratings, ranks=observations)
def mean(data):
return np.mean(data)
def median(data):
return np.median(data)
def stdev(data):
return np.std(data)
def variance(data):
return np.var(data)
def npmin(data):
return np.amin(data)
def npmax(data):
return np.amax(data)
def kmeans(data, n_clusters=8, init="k-means++", n_init=10, max_iter=300, tol=0.0001, precompute_distances="auto", verbose=0, random_state=None, copy_x=True, n_jobs=None, algorithm="auto"):
kernel = sklearn.cluster.KMeans(n_clusters = n_clusters, init = init, n_init = n_init, max_iter = max_iter, tol = tol, precompute_distances = precompute_distances, verbose = verbose, random_state = random_state, copy_x = copy_x, n_jobs = n_jobs, algorithm = algorithm)
kernel.fit(data)
predictions = kernel.predict(data)
centers = kernel.cluster_centers_
return centers, predictions
def pca(data, n_components = None, copy = True, whiten = False, svd_solver = "auto", tol = 0.0, iterated_power = "auto", random_state = None):
kernel = sklearn.decomposition.PCA(n_components = n_components, copy = copy, whiten = whiten, svd_solver = svd_solver, tol = tol, iterated_power = iterated_power, random_state = random_state)
return kernel.fit_transform(data)
def decisiontree(data, labels, test_size = 0.3, criterion = "gini", splitter = "default", max_depth = None): #expects *2d data and 1d labels
data_train, data_test, labels_train, labels_test = sklearn.model_selection.train_test_split(data, labels, test_size=test_size, random_state=1)
model = sklearn.tree.DecisionTreeClassifier(criterion = criterion, splitter = splitter, max_depth = max_depth)
model = model.fit(data_train,labels_train)
predictions = model.predict(data_test)
metrics = ClassificationMetric(predictions, labels_test)
return model, metrics

View File

@ -0,0 +1,164 @@
# Titan Robotics Team 2022: Array submodule
# Written by Arthur Lu
# Notes:
# this should be imported as a python module using 'from tra_analysis import Array'
# setup:
__version__ = "1.0.3"
__changelog__ = """changelog:
1.0.3:
- fixed __all__
1.0.2:
- fixed several implementation bugs with magic methods
1.0.1:
- removed search and __search functions
1.0.0:
- ported analysis.Array() here
"""
__author__ = (
"Arthur Lu <learthurgo@gmail.com>",
)
__all__ = [
"Array",
]
import numpy as np
import warnings
class Array(): # tests on nd arrays independent of basic_stats
def __init__(self, narray):
self.array = np.array(narray)
def __str__(self):
return str(self.array)
def __repr__(self):
return str(self.array)
def elementwise_mean(self, axis = 0): # expects arrays that are size normalized
return np.mean(self.array, axis = axis)
def elementwise_median(self, axis = 0):
return np.median(self.array, axis = axis)
def elementwise_stdev(self, axis = 0):
return np.std(self.array, axis = axis)
def elementwise_variance(self, axis = 0):
return np.var(self.array, axis = axis)
def elementwise_npmin(self, axis = 0):
return np.amin(self.array, axis = axis)
def elementwise_npmax(self, axis = 0):
return np.amax(self.array, axis = axis)
def elementwise_stats(self, axis = 0):
_mean = self.elementwise_mean(axis = axis)
_median = self.elementwise_median(axis = axis)
_stdev = self.elementwise_stdev(axis = axis)
_variance = self.elementwise_variance(axis = axis)
_min = self.elementwise_npmin(axis = axis)
_max = self.elementwise_npmax(axis = axis)
return _mean, _median, _stdev, _variance, _min, _max
def __getitem__(self, key):
return self.array[key]
def __setitem__(self, key, value):
self.array[key] = value
def __len__(self):
return len(self.array)
def normalize(self):
a = np.atleast_1d(np.linalg.norm(self.array))
a[a==0] = 1
return Array(self.array / np.expand_dims(a, -1))
def __add__(self, other):
return Array(self.array + other.array)
def __sub__(self, other):
return Array(self.array - other.array)
def __neg__(self):
return Array(-self.array)
def __abs__(self):
return Array(abs(self.array))
def __invert__(self):
return Array(1/self.array)
def __mul__(self, other):
if(isinstance(other, Array)):
return Array(self.array.dot(other.array))
elif(isinstance(other, int)):
return Array(other * self.array)
else:
raise Exception("unsupported multiplication between Array and " + str(type(other)))
def __rmul__(self, other):
return self.__mul__(other)
def cross(self, other):
return np.cross(self.array, other.array)
def transpose(self):
return Array(np.transpose(self.array))
def sort(self, array): # depreciated
warnings.warn("Array.sort has been depreciated in favor of Sort")
array_length = len(array)
if array_length <= 1:
return array
middle_index = int(array_length / 2)
left = array[0:middle_index]
right = array[middle_index:]
left = self.sort(left)
right = self.sort(right)
return self.__merge(left, right)
def __merge(self, left, right):
sorted_list = []
left = left[:]
right = right[:]
while len(left) > 0 or len(right) > 0:
if len(left) > 0 and len(right) > 0:
if left[0] <= right[0]:
sorted_list.append(left.pop(0))
else:
sorted_list.append(right.pop(0))
elif len(left) > 0:
sorted_list.append(left.pop(0))
elif len(right) > 0:
sorted_list.append(right.pop(0))
return sorted_list

View File

@ -0,0 +1,39 @@
# Titan Robotics Team 2022: ClassificationMetric submodule
# Written by Arthur Lu
# Notes:
# this should be imported as a python module using 'from tra_analysis import ClassificationMetric'
# setup:
__version__ = "1.0.1"
__changelog__ = """changelog:
1.0.1:
- fixed __all__
1.0.0:
- ported analysis.ClassificationMetric() here
"""
__author__ = (
"Arthur Lu <learthurgo@gmail.com>",
)
__all__ = [
"ClassificationMetric",
]
import sklearn
from sklearn import metrics
class ClassificationMetric():
def __new__(cls, predictions, targets):
return cls.cm(cls, predictions, targets), cls.cr(cls, predictions, targets)
def cm(self, predictions, targets):
return sklearn.metrics.confusion_matrix(targets, predictions)
def cr(self, predictions, targets):
return sklearn.metrics.classification_report(targets, predictions)

View File

@ -0,0 +1,67 @@
# Titan Robotics Team 2022: CorrelationTest submodule
# Written by Arthur Lu
# Notes:
# this should be imported as a python module using 'from tra_analysis import CorrelationTest'
# setup:
__version__ = "1.0.1"
__changelog__ = """changelog:
1.0.1:
- fixed __all__
1.0.0:
- ported analysis.CorrelationTest() here
- removed classness
"""
__author__ = (
"Arthur Lu <learthurgo@gmail.com>",
)
__all__ = [
"anova_oneway",
"pearson",
"spearman",
"point_biserial",
"kendall",
"kendall_weighted",
"mgc",
]
import scipy
from scipy import stats
def anova_oneway(*args): #expects arrays of samples
results = scipy.stats.f_oneway(*args)
return {"f-value": results[0], "p-value": results[1]}
def pearson(x, y):
results = scipy.stats.pearsonr(x, y)
return {"r-value": results[0], "p-value": results[1]}
def spearman(a, b = None, axis = 0, nan_policy = 'propagate'):
results = scipy.stats.spearmanr(a, b = b, axis = axis, nan_policy = nan_policy)
return {"r-value": results[0], "p-value": results[1]}
def point_biserial(x, y):
results = scipy.stats.pointbiserialr(x, y)
return {"r-value": results[0], "p-value": results[1]}
def kendall(x, y, initial_lexsort = None, nan_policy = 'propagate', method = 'auto'):
results = scipy.stats.kendalltau(x, y, initial_lexsort = initial_lexsort, nan_policy = nan_policy, method = method)
return {"tau": results[0], "p-value": results[1]}
def kendall_weighted(x, y, rank = True, weigher = None, additive = True):
results = scipy.stats.weightedtau(x, y, rank = rank, weigher = weigher, additive = additive)
return {"tau": results[0], "p-value": results[1]}
def mgc(x, y, compute_distance = None, reps = 1000, workers = 1, is_twosamp = False, random_state = None):
results = scipy.stats.multiscale_graphcorr(x, y, compute_distance = compute_distance, reps = reps, workers = workers, is_twosamp = is_twosamp, random_state = random_state)
return {"k-value": results[0], "p-value": results[1], "data": results[2]} # unsure if MGC test returns a k value

View File

@ -0,0 +1,41 @@
# Only included for backwards compatibility! Do not update, CorrelationTest is preferred and supported.
import scipy
from scipy import stats
class CorrelationTest:
def anova_oneway(self, *args): #expects arrays of samples
results = scipy.stats.f_oneway(*args)
return {"f-value": results[0], "p-value": results[1]}
def pearson(self, x, y):
results = scipy.stats.pearsonr(x, y)
return {"r-value": results[0], "p-value": results[1]}
def spearman(self, a, b = None, axis = 0, nan_policy = 'propagate'):
results = scipy.stats.spearmanr(a, b = b, axis = axis, nan_policy = nan_policy)
return {"r-value": results[0], "p-value": results[1]}
def point_biserial(self, x,y):
results = scipy.stats.pointbiserialr(x, y)
return {"r-value": results[0], "p-value": results[1]}
def kendall(self, x, y, initial_lexsort = None, nan_policy = 'propagate', method = 'auto'):
results = scipy.stats.kendalltau(x, y, initial_lexsort = initial_lexsort, nan_policy = nan_policy, method = method)
return {"tau": results[0], "p-value": results[1]}
def kendall_weighted(self, x, y, rank = True, weigher = None, additive = True):
results = scipy.stats.weightedtau(x, y, rank = rank, weigher = weigher, additive = additive)
return {"tau": results[0], "p-value": results[1]}
def mgc(self, x, y, compute_distance = None, reps = 1000, workers = 1, is_twosamp = False, random_state = None):
results = scipy.stats.multiscale_graphcorr(x, y, compute_distance = compute_distance, reps = reps, workers = workers, is_twosamp = is_twosamp, random_state = random_state)
return {"k-value": results[0], "p-value": results[1], "data": results[2]} # unsure if MGC test returns a k value

View File

@ -4,10 +4,12 @@
# this module is cuda-optimized (as appropriate) and vectorized (except for one small part) # this module is cuda-optimized (as appropriate) and vectorized (except for one small part)
# setup: # setup:
__version__ = "0.0.1" __version__ = "0.0.2"
# changelog should be viewed using print(analysis.fits.__changelog__) # changelog should be viewed using print(analysis.fits.__changelog__)
__changelog__ = """changelog: __changelog__ = """changelog:
0.0.2:
- renamed module to Fit
0.0.1: 0.0.1:
- initial release, add circle fitting with LSC - initial release, add circle fitting with LSC
""" """

View File

@ -0,0 +1,45 @@
# Titan Robotics Team 2022: KNN submodule
# Written by Arthur Lu
# Notes:
# this should be imported as a python module using 'from tra_analysis import KNN'
# setup:
__version__ = "1.0.0"
__changelog__ = """changelog:
1.0.0:
- ported analysis.KNN() here
- removed classness
"""
__author__ = (
"Arthur Lu <learthurgo@gmail.com>",
"James Pan <zpan@imsa.edu>"
)
__all__ = [
'knn_classifier',
'knn_regressor'
]
import sklearn
from sklearn import model_selection, neighbors
from . import ClassificationMetric, RegressionMetric
def knn_classifier(data, labels, n_neighbors = 5, test_size = 0.3, algorithm='auto', leaf_size=30, metric='minkowski', metric_params=None, n_jobs=None, p=2, weights='uniform'): #expects *2d data and 1d labels post-scaling
data_train, data_test, labels_train, labels_test = sklearn.model_selection.train_test_split(data, labels, test_size=test_size, random_state=1)
model = sklearn.neighbors.KNeighborsClassifier(n_neighbors = n_neighbors, weights = weights, algorithm = algorithm, leaf_size = leaf_size, p = p, metric = metric, metric_params = metric_params, n_jobs = n_jobs)
model.fit(data_train, labels_train)
predictions = model.predict(data_test)
return model, ClassificationMetric(predictions, labels_test)
def knn_regressor(data, outputs, n_neighbors = 5, test_size = 0.3, weights = "uniform", algorithm = "auto", leaf_size = 30, p = 2, metric = "minkowski", metric_params = None, n_jobs = None):
data_train, data_test, outputs_train, outputs_test = sklearn.model_selection.train_test_split(data, outputs, test_size=test_size, random_state=1)
model = sklearn.neighbors.KNeighborsRegressor(n_neighbors = n_neighbors, weights = weights, algorithm = algorithm, leaf_size = leaf_size, p = p, metric = metric, metric_params = metric_params, n_jobs = n_jobs)
model.fit(data_train, outputs_train)
predictions = model.predict(data_test)
return model, RegressionMetric.RegressionMetric(predictions, outputs_test)

View File

@ -0,0 +1,25 @@
# Only included for backwards compatibility! Do not update, NaiveBayes is preferred and supported.
import sklearn
from sklearn import model_selection, neighbors
from . import ClassificationMetric, RegressionMetric
class KNN:
def knn_classifier(self, data, labels, n_neighbors, test_size = 0.3, algorithm='auto', leaf_size=30, metric='minkowski', metric_params=None, n_jobs=None, p=2, weights='uniform'): #expects *2d data and 1d labels post-scaling
data_train, data_test, labels_train, labels_test = sklearn.model_selection.train_test_split(data, labels, test_size=test_size, random_state=1)
model = sklearn.neighbors.KNeighborsClassifier()
model.fit(data_train, labels_train)
predictions = model.predict(data_test)
return model, ClassificationMetric(predictions, labels_test)
def knn_regressor(self, data, outputs, n_neighbors, test_size = 0.3, weights = "uniform", algorithm = "auto", leaf_size = 30, p = 2, metric = "minkowski", metric_params = None, n_jobs = None):
data_train, data_test, outputs_train, outputs_test = sklearn.model_selection.train_test_split(data, outputs, test_size=test_size, random_state=1)
model = sklearn.neighbors.KNeighborsRegressor(n_neighbors = n_neighbors, weights = weights, algorithm = algorithm, leaf_size = leaf_size, p = p, metric = metric, metric_params = metric_params, n_jobs = n_jobs)
model.fit(data_train, outputs_train)
predictions = model.predict(data_test)
return model, RegressionMetric(predictions, outputs_test)

View File

@ -0,0 +1,64 @@
# Titan Robotics Team 2022: NaiveBayes submodule
# Written by Arthur Lu
# Notes:
# this should be imported as a python module using 'from tra_analysis import NaiveBayes'
# setup:
__version__ = "1.0.0"
__changelog__ = """changelog:
1.0.0:
- ported analysis.NaiveBayes() here
- removed classness
"""
__author__ = (
"Arthur Lu <learthurgo@gmail.com>",
)
__all__ = [
'gaussian',
'multinomial'
'bernoulli',
'complement'
]
import sklearn
from sklearn import model_selection, naive_bayes
from . import ClassificationMetric, RegressionMetric
def gaussian(data, labels, test_size = 0.3, priors = None, var_smoothing = 1e-09):
data_train, data_test, labels_train, labels_test = sklearn.model_selection.train_test_split(data, labels, test_size=test_size, random_state=1)
model = sklearn.naive_bayes.GaussianNB(priors = priors, var_smoothing = var_smoothing)
model.fit(data_train, labels_train)
predictions = model.predict(data_test)
return model, ClassificationMetric(predictions, labels_test)
def multinomial(data, labels, test_size = 0.3, alpha=1.0, fit_prior=True, class_prior=None):
data_train, data_test, labels_train, labels_test = sklearn.model_selection.train_test_split(data, labels, test_size=test_size, random_state=1)
model = sklearn.naive_bayes.MultinomialNB(alpha = alpha, fit_prior = fit_prior, class_prior = class_prior)
model.fit(data_train, labels_train)
predictions = model.predict(data_test)
return model, ClassificationMetric(predictions, labels_test)
def bernoulli(data, labels, test_size = 0.3, alpha=1.0, binarize=0.0, fit_prior=True, class_prior=None):
data_train, data_test, labels_train, labels_test = sklearn.model_selection.train_test_split(data, labels, test_size=test_size, random_state=1)
model = sklearn.naive_bayes.BernoulliNB(alpha = alpha, binarize = binarize, fit_prior = fit_prior, class_prior = class_prior)
model.fit(data_train, labels_train)
predictions = model.predict(data_test)
return model, ClassificationMetric(predictions, labels_test)
def complement(data, labels, test_size = 0.3, alpha=1.0, fit_prior=True, class_prior=None, norm=False):
data_train, data_test, labels_train, labels_test = sklearn.model_selection.train_test_split(data, labels, test_size=test_size, random_state=1)
model = sklearn.naive_bayes.ComplementNB(alpha = alpha, fit_prior = fit_prior, class_prior = class_prior, norm = norm)
model.fit(data_train, labels_train)
predictions = model.predict(data_test)
return model, ClassificationMetric(predictions, labels_test)

View File

@ -0,0 +1,43 @@
# Only included for backwards compatibility! Do not update, NaiveBayes is preferred and supported.
import sklearn
from sklearn import model_selection, naive_bayes
from . import ClassificationMetric, RegressionMetric
class NaiveBayes:
def guassian(self, data, labels, test_size = 0.3, priors = None, var_smoothing = 1e-09):
data_train, data_test, labels_train, labels_test = sklearn.model_selection.train_test_split(data, labels, test_size=test_size, random_state=1)
model = sklearn.naive_bayes.GaussianNB(priors = priors, var_smoothing = var_smoothing)
model.fit(data_train, labels_train)
predictions = model.predict(data_test)
return model, ClassificationMetric(predictions, labels_test)
def multinomial(self, data, labels, test_size = 0.3, alpha=1.0, fit_prior=True, class_prior=None):
data_train, data_test, labels_train, labels_test = sklearn.model_selection.train_test_split(data, labels, test_size=test_size, random_state=1)
model = sklearn.naive_bayes.MultinomialNB(alpha = alpha, fit_prior = fit_prior, class_prior = class_prior)
model.fit(data_train, labels_train)
predictions = model.predict(data_test)
return model, ClassificationMetric(predictions, labels_test)
def bernoulli(self, data, labels, test_size = 0.3, alpha=1.0, binarize=0.0, fit_prior=True, class_prior=None):
data_train, data_test, labels_train, labels_test = sklearn.model_selection.train_test_split(data, labels, test_size=test_size, random_state=1)
model = sklearn.naive_bayes.BernoulliNB(alpha = alpha, binarize = binarize, fit_prior = fit_prior, class_prior = class_prior)
model.fit(data_train, labels_train)
predictions = model.predict(data_test)
return model, ClassificationMetric(predictions, labels_test)
def complement(self, data, labels, test_size = 0.3, alpha=1.0, fit_prior=True, class_prior=None, norm=False):
data_train, data_test, labels_train, labels_test = sklearn.model_selection.train_test_split(data, labels, test_size=test_size, random_state=1)
model = sklearn.naive_bayes.ComplementNB(alpha = alpha, fit_prior = fit_prior, class_prior = class_prior, norm = norm)
model.fit(data_train, labels_train)
predictions = model.predict(data_test)
return model, ClassificationMetric(predictions, labels_test)

View File

@ -0,0 +1,46 @@
# Titan Robotics Team 2022: RandomForest submodule
# Written by Arthur Lu
# Notes:
# this should be imported as a python module using 'from tra_analysis import RandomForest'
# setup:
__version__ = "1.0.1"
__changelog__ = """changelog:
1.0.1:
- fixed __all__
1.0.0:
- ported analysis.RandomFores() here
- removed classness
"""
__author__ = (
"Arthur Lu <learthurgo@gmail.com>",
)
__all__ = [
"random_forest_classifier",
"random_forest_regressor",
]
import sklearn
from sklearn import ensemble, model_selection
from . import ClassificationMetric, RegressionMetric
def random_forest_classifier(data, labels, test_size, n_estimators, criterion="gini", max_depth=None, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features="auto", max_leaf_nodes=None, min_impurity_decrease=0.0, min_impurity_split=None, bootstrap=True, oob_score=False, n_jobs=None, random_state=None, verbose=0, warm_start=False, class_weight=None):
data_train, data_test, labels_train, labels_test = sklearn.model_selection.train_test_split(data, labels, test_size=test_size, random_state=1)
kernel = sklearn.ensemble.RandomForestClassifier(n_estimators = n_estimators, criterion = criterion, max_depth = max_depth, min_samples_split = min_samples_split, min_samples_leaf = min_samples_leaf, min_weight_fraction_leaf = min_weight_fraction_leaf, max_leaf_nodes = max_leaf_nodes, min_impurity_decrease = min_impurity_decrease, bootstrap = bootstrap, oob_score = oob_score, n_jobs = n_jobs, random_state = random_state, verbose = verbose, warm_start = warm_start, class_weight = class_weight)
kernel.fit(data_train, labels_train)
predictions = kernel.predict(data_test)
return kernel, ClassificationMetric(predictions, labels_test)
def random_forest_regressor(data, outputs, test_size, n_estimators, criterion="mse", max_depth=None, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features="auto", max_leaf_nodes=None, min_impurity_decrease=0.0, min_impurity_split=None, bootstrap=True, oob_score=False, n_jobs=None, random_state=None, verbose=0, warm_start=False):
data_train, data_test, outputs_train, outputs_test = sklearn.model_selection.train_test_split(data, outputs, test_size=test_size, random_state=1)
kernel = sklearn.ensemble.RandomForestRegressor(n_estimators = n_estimators, criterion = criterion, max_depth = max_depth, min_samples_split = min_samples_split, min_weight_fraction_leaf = min_weight_fraction_leaf, max_features = max_features, max_leaf_nodes = max_leaf_nodes, min_impurity_decrease = min_impurity_decrease, min_impurity_split = min_impurity_split, bootstrap = bootstrap, oob_score = oob_score, n_jobs = n_jobs, random_state = random_state, verbose = verbose, warm_start = warm_start)
kernel.fit(data_train, outputs_train)
predictions = kernel.predict(data_test)
return kernel, RegressionMetric.RegressionMetric(predictions, outputs_test)

View File

@ -0,0 +1,25 @@
# Only included for backwards compatibility! Do not update, RandomForest is preferred and supported.
import sklearn
from sklearn import ensemble, model_selection
from . import ClassificationMetric, RegressionMetric
class RandomForest:
def random_forest_classifier(self, data, labels, test_size, n_estimators, criterion="gini", max_depth=None, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features="auto", max_leaf_nodes=None, min_impurity_decrease=0.0, min_impurity_split=None, bootstrap=True, oob_score=False, n_jobs=None, random_state=None, verbose=0, warm_start=False, class_weight=None):
data_train, data_test, labels_train, labels_test = sklearn.model_selection.train_test_split(data, labels, test_size=test_size, random_state=1)
kernel = sklearn.ensemble.RandomForestClassifier(n_estimators = n_estimators, criterion = criterion, max_depth = max_depth, min_samples_split = min_samples_split, min_samples_leaf = min_samples_leaf, min_weight_fraction_leaf = min_weight_fraction_leaf, max_leaf_nodes = max_leaf_nodes, min_impurity_decrease = min_impurity_decrease, bootstrap = bootstrap, oob_score = oob_score, n_jobs = n_jobs, random_state = random_state, verbose = verbose, warm_start = warm_start, class_weight = class_weight)
kernel.fit(data_train, labels_train)
predictions = kernel.predict(data_test)
return kernel, ClassificationMetric(predictions, labels_test)
def random_forest_regressor(self, data, outputs, test_size, n_estimators, criterion="mse", max_depth=None, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features="auto", max_leaf_nodes=None, min_impurity_decrease=0.0, min_impurity_split=None, bootstrap=True, oob_score=False, n_jobs=None, random_state=None, verbose=0, warm_start=False):
data_train, data_test, outputs_train, outputs_test = sklearn.model_selection.train_test_split(data, outputs, test_size=test_size, random_state=1)
kernel = sklearn.ensemble.RandomForestRegressor(n_estimators = n_estimators, criterion = criterion, max_depth = max_depth, min_samples_split = min_samples_split, min_weight_fraction_leaf = min_weight_fraction_leaf, max_features = max_features, max_leaf_nodes = max_leaf_nodes, min_impurity_decrease = min_impurity_decrease, min_impurity_split = min_impurity_split, bootstrap = bootstrap, oob_score = oob_score, n_jobs = n_jobs, random_state = random_state, verbose = verbose, warm_start = warm_start)
kernel.fit(data_train, outputs_train)
predictions = kernel.predict(data_test)
return kernel, RegressionMetric(predictions, outputs_test)

View File

@ -0,0 +1,42 @@
# Titan Robotics Team 2022: RegressionMetric submodule
# Written by Arthur Lu
# Notes:
# this should be imported as a python module using 'from tra_analysis import RegressionMetric'
# setup:
__version__ = "1.0.0"
__changelog__ = """changelog:
1.0.0:
- ported analysis.RegressionMetric() here
"""
__author__ = (
"Arthur Lu <learthurgo@gmail.com>",
)
__all__ = [
'RegressionMetric'
]
import numpy as np
import sklearn
from sklearn import metrics
class RegressionMetric():
def __new__(cls, predictions, targets):
return cls.r_squared(cls, predictions, targets), cls.mse(cls, predictions, targets), cls.rms(cls, predictions, targets)
def r_squared(self, predictions, targets): # assumes equal size inputs
return sklearn.metrics.r2_score(targets, predictions)
def mse(self, predictions, targets):
return sklearn.metrics.mean_squared_error(targets, predictions)
def rms(self, predictions, targets):
return np.sqrt(sklearn.metrics.mean_squared_error(targets, predictions))

View File

@ -0,0 +1,88 @@
# Titan Robotics Team 2022: SVM submodule
# Written by Arthur Lu
# Notes:
# this should be imported as a python module using 'from tra_analysis import SVM'
# setup:
__version__ = "1.0.2"
__changelog__ = """changelog:
1.0.2:
- fixed __all__
1.0.1:
- removed unessasary self calls
- removed classness
1.0.0:
- ported analysis.SVM() here
"""
__author__ = (
"Arthur Lu <learthurgo@gmail.com>",
)
__all__ = [
"CustomKernel",
"StandardKernel",
"PrebuiltKernel",
"fit",
"eval_classification",
"eval_regression",
]
import sklearn
from sklearn import svm
from . import ClassificationMetric, RegressionMetric
class CustomKernel:
def __new__(cls, C, kernel, degre, gamma, coef0, shrinking, probability, tol, cache_size, class_weight, verbose, max_iter, decision_function_shape, random_state):
return sklearn.svm.SVC(C = C, kernel = kernel, gamma = gamma, coef0 = coef0, shrinking = shrinking, probability = probability, tol = tol, cache_size = cache_size, class_weight = class_weight, verbose = verbose, max_iter = max_iter, decision_function_shape = decision_function_shape, random_state = random_state)
class StandardKernel:
def __new__(cls, kernel, C=1.0, degree=3, gamma='auto_deprecated', coef0=0.0, shrinking=True, probability=False, tol=0.001, cache_size=200, class_weight=None, verbose=False, max_iter=-1, decision_function_shape='ovr', random_state=None):
return sklearn.svm.SVC(C = C, kernel = kernel, gamma = gamma, coef0 = coef0, shrinking = shrinking, probability = probability, tol = tol, cache_size = cache_size, class_weight = class_weight, verbose = verbose, max_iter = max_iter, decision_function_shape = decision_function_shape, random_state = random_state)
class PrebuiltKernel:
class Linear:
def __new__(cls):
return sklearn.svm.SVC(kernel = 'linear')
class Polynomial:
def __new__(cls, power, r_bias):
return sklearn.svm.SVC(kernel = 'polynomial', degree = power, coef0 = r_bias)
class RBF:
def __new__(cls, gamma):
return sklearn.svm.SVC(kernel = 'rbf', gamma = gamma)
class Sigmoid:
def __new__(cls, r_bias):
return sklearn.svm.SVC(kernel = 'sigmoid', coef0 = r_bias)
def fit(kernel, train_data, train_outputs): # expects *2d data, 1d labels or outputs
return kernel.fit(train_data, train_outputs)
def eval_classification(kernel, test_data, test_outputs):
predictions = kernel.predict(test_data)
return ClassificationMetric(predictions, test_outputs)
def eval_regression(kernel, test_data, test_outputs):
predictions = kernel.predict(test_data)
return RegressionMetric(predictions, test_outputs)

View File

@ -0,0 +1,424 @@
# Titan Robotics Team 2022: Sort submodule
# Written by Arthur Lu and James Pan
# Notes:
# this should be imported as a python module using 'from tra_analysis import Sort'
# setup:
__version__ = "1.0.1"
__changelog__ = """changelog:
1.0.1:
- fixed __all__
1.0.0:
- ported analysis.Sort() here
- removed classness
"""
__author__ = (
"Arthur Lu <learthurgo@gmail.com>",
"James Pan <zpan@imsa.edu>"
)
__all__ = [
"quicksort",
"mergesort",
"introsort",
"heapsort",
"insertionsort",
"timsort",
"selectionsort",
"shellsort",
"bubblesort",
"cyclesort",
"cocktailsort",
]
import numpy as np
def quicksort(a):
def sort(array):
less = []
equal = []
greater = []
if len(array) > 1:
pivot = array[0]
for x in array:
if x < pivot:
less.append(x)
elif x == pivot:
equal.append(x)
elif x > pivot:
greater.append(x)
return sort(less)+equal+sort(greater)
else:
return array
return np.array(sort(a))
def mergesort(a):
def sort(array):
array = array
if len(array) >1:
middle = len(array) // 2
L = array[:middle]
R = array[middle:]
sort(L)
sort(R)
i = j = k = 0
while i < len(L) and j < len(R):
if L[i] < R[j]:
array[k] = L[i]
i+= 1
else:
array[k] = R[j]
j+= 1
k+= 1
while i < len(L):
array[k] = L[i]
i+= 1
k+= 1
while j < len(R):
array[k] = R[j]
j+= 1
k+= 1
return array
return sort(a)
def introsort(a):
def sort(array, start, end, maxdepth):
array = array
if end - start <= 1:
return
elif maxdepth == 0:
heapsort(array, start, end)
else:
p = partition(array, start, end)
sort(array, start, p + 1, maxdepth - 1)
sort(array, p + 1, end, maxdepth - 1)
return array
def partition(array, start, end):
pivot = array[start]
i = start - 1
j = end
while True:
i = i + 1
while array[i] < pivot:
i = i + 1
j = j - 1
while array[j] > pivot:
j = j - 1
if i >= j:
return j
swap(array, i, j)
def swap(array, i, j):
array[i], array[j] = array[j], array[i]
def heapsort(array, start, end):
build_max_heap(array, start, end)
for i in range(end - 1, start, -1):
swap(array, start, i)
max_heapify(array, index=0, start=start, end=i)
def build_max_heap(array, start, end):
def parent(i):
return (i - 1)//2
length = end - start
index = parent(length - 1)
while index >= 0:
max_heapify(array, index, start, end)
index = index - 1
def max_heapify(array, index, start, end):
def left(i):
return 2*i + 1
def right(i):
return 2*i + 2
size = end - start
l = left(index)
r = right(index)
if (l < size and array[start + l] > array[start + index]):
largest = l
else:
largest = index
if (r < size and array[start + r] > array[start + largest]):
largest = r
if largest != index:
swap(array, start + largest, start + index)
max_heapify(array, largest, start, end)
maxdepth = (len(a).bit_length() - 1)*2
return sort(a, 0, len(a), maxdepth)
def heapsort(a):
def sort(array):
array = array
n = len(array)
for i in range(n//2 - 1, -1, -1):
heapify(array, n, i)
for i in range(n-1, 0, -1):
array[i], array[0] = array[0], array[i]
heapify(array, i, 0)
return array
def heapify(array, n, i):
array = array
largest = i
l = 2 * i + 1
r = 2 * i + 2
if l < n and array[i] < array[l]:
largest = l
if r < n and array[largest] < array[r]:
largest = r
if largest != i:
array[i],array[largest] = array[largest],array[i]
heapify(array, n, largest)
return array
return sort(a)
def insertionsort(a):
def sort(array):
array = array
for i in range(1, len(array)):
key = array[i]
j = i-1
while j >=0 and key < array[j] :
array[j+1] = array[j]
j -= 1
array[j+1] = key
return array
return sort(a)
def timsort(a, block = 32):
BLOCK = block
def sort(array, n):
array = array
for i in range(0, n, BLOCK):
insertionsort(array, i, min((i+31), (n-1)))
size = BLOCK
while size < n:
for left in range(0, n, 2*size):
mid = left + size - 1
right = min((left + 2*size - 1), (n-1))
merge(array, left, mid, right)
size = 2*size
return array
def insertionsort(array, left, right):
array = array
for i in range(left + 1, right+1):
temp = array[i]
j = i - 1
while j >= left and array[j] > temp :
array[j+1] = array[j]
j -= 1
array[j+1] = temp
return array
def merge(array, l, m, r):
len1, len2 = m - l + 1, r - m
left, right = [], []
for i in range(0, len1):
left.append(array[l + i])
for i in range(0, len2):
right.append(array[m + 1 + i])
i, j, k = 0, 0, l
while i < len1 and j < len2:
if left[i] <= right[j]:
array[k] = left[i]
i += 1
else:
array[k] = right[j]
j += 1
k += 1
while i < len1:
array[k] = left[i]
k += 1
i += 1
while j < len2:
array[k] = right[j]
k += 1
j += 1
return sort(a, len(a))
def selectionsort(a):
array = a
for i in range(len(array)):
min_idx = i
for j in range(i+1, len(array)):
if array[min_idx] > array[j]:
min_idx = j
array[i], array[min_idx] = array[min_idx], array[i]
return array
def shellsort(a):
array = a
n = len(array)
gap = n//2
while gap > 0:
for i in range(gap,n):
temp = array[i]
j = i
while j >= gap and array[j-gap] >temp:
array[j] = array[j-gap]
j -= gap
array[j] = temp
gap //= 2
return array
def bubblesort(a):
def sort(array):
for i, num in enumerate(array):
try:
if array[i+1] < num:
array[i] = array[i+1]
array[i+1] = num
sort(array)
except IndexError:
pass
return array
return sort(a)
def cyclesort(a):
def sort(array):
array = array
writes = 0
for cycleStart in range(0, len(array) - 1):
item = array[cycleStart]
pos = cycleStart
for i in range(cycleStart + 1, len(array)):
if array[i] < item:
pos += 1
if pos == cycleStart:
continue
while item == array[pos]:
pos += 1
array[pos], item = item, array[pos]
writes += 1
while pos != cycleStart:
pos = cycleStart
for i in range(cycleStart + 1, len(array)):
if array[i] < item:
pos += 1
while item == array[pos]:
pos += 1
array[pos], item = item, array[pos]
writes += 1
return array
return sort(a)
def cocktailsort(a):
def sort(array):
array = array
n = len(array)
swapped = True
start = 0
end = n-1
while (swapped == True):
swapped = False
for i in range (start, end):
if (array[i] > array[i + 1]) :
array[i], array[i + 1]= array[i + 1], array[i]
swapped = True
if (swapped == False):
break
swapped = False
end = end-1
for i in range(end-1, start-1, -1):
if (array[i] > array[i + 1]):
array[i], array[i + 1] = array[i + 1], array[i]
swapped = True
start = start + 1
return array
return sort(a)

View File

@ -0,0 +1,391 @@
# Only included for backwards compatibility! Do not update, Sort is preferred and supported.
class Sort: # if you haven't used a sort, then you've never lived
def quicksort(self, a):
def sort(array):
less = []
equal = []
greater = []
if len(array) > 1:
pivot = array[0]
for x in array:
if x < pivot:
less.append(x)
elif x == pivot:
equal.append(x)
elif x > pivot:
greater.append(x)
return sort(less)+equal+sort(greater)
else:
return array
return np.array(sort(a))
def mergesort(self, a):
def sort(array):
array = array
if len(array) >1:
middle = len(array) // 2
L = array[:middle]
R = array[middle:]
sort(L)
sort(R)
i = j = k = 0
while i < len(L) and j < len(R):
if L[i] < R[j]:
array[k] = L[i]
i+= 1
else:
array[k] = R[j]
j+= 1
k+= 1
while i < len(L):
array[k] = L[i]
i+= 1
k+= 1
while j < len(R):
array[k] = R[j]
j+= 1
k+= 1
return array
return sort(a)
def introsort(self, a):
def sort(array, start, end, maxdepth):
array = array
if end - start <= 1:
return
elif maxdepth == 0:
heapsort(array, start, end)
else:
p = partition(array, start, end)
sort(array, start, p + 1, maxdepth - 1)
sort(array, p + 1, end, maxdepth - 1)
return array
def partition(array, start, end):
pivot = array[start]
i = start - 1
j = end
while True:
i = i + 1
while array[i] < pivot:
i = i + 1
j = j - 1
while array[j] > pivot:
j = j - 1
if i >= j:
return j
swap(array, i, j)
def swap(array, i, j):
array[i], array[j] = array[j], array[i]
def heapsort(array, start, end):
build_max_heap(array, start, end)
for i in range(end - 1, start, -1):
swap(array, start, i)
max_heapify(array, index=0, start=start, end=i)
def build_max_heap(array, start, end):
def parent(i):
return (i - 1)//2
length = end - start
index = parent(length - 1)
while index >= 0:
max_heapify(array, index, start, end)
index = index - 1
def max_heapify(array, index, start, end):
def left(i):
return 2*i + 1
def right(i):
return 2*i + 2
size = end - start
l = left(index)
r = right(index)
if (l < size and array[start + l] > array[start + index]):
largest = l
else:
largest = index
if (r < size and array[start + r] > array[start + largest]):
largest = r
if largest != index:
swap(array, start + largest, start + index)
max_heapify(array, largest, start, end)
maxdepth = (len(a).bit_length() - 1)*2
return sort(a, 0, len(a), maxdepth)
def heapsort(self, a):
def sort(array):
array = array
n = len(array)
for i in range(n//2 - 1, -1, -1):
heapify(array, n, i)
for i in range(n-1, 0, -1):
array[i], array[0] = array[0], array[i]
heapify(array, i, 0)
return array
def heapify(array, n, i):
array = array
largest = i
l = 2 * i + 1
r = 2 * i + 2
if l < n and array[i] < array[l]:
largest = l
if r < n and array[largest] < array[r]:
largest = r
if largest != i:
array[i],array[largest] = array[largest],array[i]
heapify(array, n, largest)
return array
return sort(a)
def insertionsort(self, a):
def sort(array):
array = array
for i in range(1, len(array)):
key = array[i]
j = i-1
while j >=0 and key < array[j] :
array[j+1] = array[j]
j -= 1
array[j+1] = key
return array
return sort(a)
def timsort(self, a, block = 32):
BLOCK = block
def sort(array, n):
array = array
for i in range(0, n, BLOCK):
insertionsort(array, i, min((i+31), (n-1)))
size = BLOCK
while size < n:
for left in range(0, n, 2*size):
mid = left + size - 1
right = min((left + 2*size - 1), (n-1))
merge(array, left, mid, right)
size = 2*size
return array
def insertionsort(array, left, right):
array = array
for i in range(left + 1, right+1):
temp = array[i]
j = i - 1
while j >= left and array[j] > temp :
array[j+1] = array[j]
j -= 1
array[j+1] = temp
return array
def merge(array, l, m, r):
len1, len2 = m - l + 1, r - m
left, right = [], []
for i in range(0, len1):
left.append(array[l + i])
for i in range(0, len2):
right.append(array[m + 1 + i])
i, j, k = 0, 0, l
while i < len1 and j < len2:
if left[i] <= right[j]:
array[k] = left[i]
i += 1
else:
array[k] = right[j]
j += 1
k += 1
while i < len1:
array[k] = left[i]
k += 1
i += 1
while j < len2:
array[k] = right[j]
k += 1
j += 1
return sort(a, len(a))
def selectionsort(self, a):
array = a
for i in range(len(array)):
min_idx = i
for j in range(i+1, len(array)):
if array[min_idx] > array[j]:
min_idx = j
array[i], array[min_idx] = array[min_idx], array[i]
return array
def shellsort(self, a):
array = a
n = len(array)
gap = n//2
while gap > 0:
for i in range(gap,n):
temp = array[i]
j = i
while j >= gap and array[j-gap] >temp:
array[j] = array[j-gap]
j -= gap
array[j] = temp
gap //= 2
return array
def bubblesort(self, a):
def sort(array):
for i, num in enumerate(array):
try:
if array[i+1] < num:
array[i] = array[i+1]
array[i+1] = num
sort(array)
except IndexError:
pass
return array
return sort(a)
def cyclesort(self, a):
def sort(array):
array = array
writes = 0
for cycleStart in range(0, len(array) - 1):
item = array[cycleStart]
pos = cycleStart
for i in range(cycleStart + 1, len(array)):
if array[i] < item:
pos += 1
if pos == cycleStart:
continue
while item == array[pos]:
pos += 1
array[pos], item = item, array[pos]
writes += 1
while pos != cycleStart:
pos = cycleStart
for i in range(cycleStart + 1, len(array)):
if array[i] < item:
pos += 1
while item == array[pos]:
pos += 1
array[pos], item = item, array[pos]
writes += 1
return array
return sort(a)
def cocktailsort(self, a):
def sort(array):
array = array
n = len(array)
swapped = True
start = 0
end = n-1
while (swapped == True):
swapped = False
for i in range (start, end):
if (array[i] > array[i + 1]) :
array[i], array[i + 1]= array[i + 1], array[i]
swapped = True
if (swapped == False):
break
swapped = False
end = end-1
for i in range(end-1, start-1, -1):
if (array[i] > array[i + 1]):
array[i], array[i + 1] = array[i + 1], array[i]
swapped = True
start = start + 1
return array
return sort(a)

View File

@ -0,0 +1,314 @@
# Titan Robotics Team 2022: StatisticalTest submodule
# Written by Arthur Lu
# Notes:
# this should be imported as a python module using 'from tra_analysis import StatisticalTest'
# setup:
__version__ = "1.0.2"
__changelog__ = """changelog:
1.0.2:
- added tukey_multicomparison
- fixed styling
1.0.1:
- fixed typo in __all__
1.0.0:
- ported analysis.StatisticalTest() here
- removed classness
"""
__author__ = (
"Arthur Lu <learthurgo@gmail.com>",
"James Pan <zpan@imsa.edu>",
)
__all__ = [
'ttest_onesample',
'ttest_independent',
'ttest_statistic',
'ttest_related',
'ks_fitness',
'chisquare',
'powerdivergence'
'ks_twosample',
'es_twosample',
'mw_rank',
'mw_tiecorrection',
'rankdata',
'wilcoxon_ranksum',
'wilcoxon_signedrank',
'kw_htest',
'friedman_chisquare',
'bm_wtest',
'combine_pvalues',
'jb_fitness',
'ab_equality',
'bartlett_variance',
'levene_variance',
'sw_normality',
'shapiro',
'ad_onesample',
'ad_ksample',
'binomial',
'fk_variance',
'mood_mediantest',
'mood_equalscale',
'skewtest',
'kurtosistest',
'normaltest',
'tukey_multicomparison'
]
import numpy as np
import scipy
from scipy import stats, interpolate
def ttest_onesample(a, popmean, axis = 0, nan_policy = 'propagate'):
results = scipy.stats.ttest_1samp(a, popmean, axis = axis, nan_policy = nan_policy)
return {"t-value": results[0], "p-value": results[1]}
def ttest_independent(a, b, equal = True, nan_policy = 'propagate'):
results = scipy.stats.ttest_ind(a, b, equal_var = equal, nan_policy = nan_policy)
return {"t-value": results[0], "p-value": results[1]}
def ttest_statistic(o1, o2, equal = True):
results = scipy.stats.ttest_ind_from_stats(o1["mean"], o1["std"], o1["nobs"], o2["mean"], o2["std"], o2["nobs"], equal_var = equal)
return {"t-value": results[0], "p-value": results[1]}
def ttest_related(a, b, axis = 0, nan_policy='propagate'):
results = scipy.stats.ttest_rel(a, b, axis = axis, nan_policy = nan_policy)
return {"t-value": results[0], "p-value": results[1]}
def ks_fitness(rvs, cdf, args = (), N = 20, alternative = 'two-sided', mode = 'approx'):
results = scipy.stats.kstest(rvs, cdf, args = args, N = N, alternative = alternative, mode = mode)
return {"ks-value": results[0], "p-value": results[1]}
def chisquare(f_obs, f_exp = None, ddof = None, axis = 0):
results = scipy.stats.chisquare(f_obs, f_exp = f_exp, ddof = ddof, axis = axis)
return {"chisquared-value": results[0], "p-value": results[1]}
def powerdivergence(f_obs, f_exp = None, ddof = None, axis = 0, lambda_ = None):
results = scipy.stats.power_divergence(f_obs, f_exp = f_exp, ddof = ddof, axis = axis, lambda_ = lambda_)
return {"powerdivergence-value": results[0], "p-value": results[1]}
def ks_twosample(x, y, alternative = 'two_sided', mode = 'auto'):
results = scipy.stats.ks_2samp(x, y, alternative = alternative, mode = mode)
return {"ks-value": results[0], "p-value": results[1]}
def es_twosample(x, y, t = (0.4, 0.8)):
results = scipy.stats.epps_singleton_2samp(x, y, t = t)
return {"es-value": results[0], "p-value": results[1]}
def mw_rank(x, y, use_continuity = True, alternative = None):
results = scipy.stats.mannwhitneyu(x, y, use_continuity = use_continuity, alternative = alternative)
return {"u-value": results[0], "p-value": results[1]}
def mw_tiecorrection(rank_values):
results = scipy.stats.tiecorrect(rank_values)
return {"correction-factor": results}
def rankdata(a, method = 'average'):
results = scipy.stats.rankdata(a, method = method)
return results
def wilcoxon_ranksum(a, b): # this seems to be superceded by Mann Whitney Wilcoxon U Test
results = scipy.stats.ranksums(a, b)
return {"u-value": results[0], "p-value": results[1]}
def wilcoxon_signedrank(x, y = None, zero_method = 'wilcox', correction = False, alternative = 'two-sided'):
results = scipy.stats.wilcoxon(x, y = y, zero_method = zero_method, correction = correction, alternative = alternative)
return {"t-value": results[0], "p-value": results[1]}
def kw_htest(*args, nan_policy = 'propagate'):
results = scipy.stats.kruskal(*args, nan_policy = nan_policy)
return {"h-value": results[0], "p-value": results[1]}
def friedman_chisquare(*args):
results = scipy.stats.friedmanchisquare(*args)
return {"chisquared-value": results[0], "p-value": results[1]}
def bm_wtest(x, y, alternative = 'two-sided', distribution = 't', nan_policy = 'propagate'):
results = scipy.stats.brunnermunzel(x, y, alternative = alternative, distribution = distribution, nan_policy = nan_policy)
return {"w-value": results[0], "p-value": results[1]}
def combine_pvalues(pvalues, method = 'fisher', weights = None):
results = scipy.stats.combine_pvalues(pvalues, method = method, weights = weights)
return {"combined-statistic": results[0], "p-value": results[1]}
def jb_fitness(x):
results = scipy.stats.jarque_bera(x)
return {"jb-value": results[0], "p-value": results[1]}
def ab_equality(x, y):
results = scipy.stats.ansari(x, y)
return {"ab-value": results[0], "p-value": results[1]}
def bartlett_variance(*args):
results = scipy.stats.bartlett(*args)
return {"t-value": results[0], "p-value": results[1]}
def levene_variance(*args, center = 'median', proportiontocut = 0.05):
results = scipy.stats.levene(*args, center = center, proportiontocut = proportiontocut)
return {"w-value": results[0], "p-value": results[1]}
def sw_normality(x):
results = scipy.stats.shapiro(x)
return {"w-value": results[0], "p-value": results[1]}
def shapiro(x):
return "destroyed by facts and logic"
def ad_onesample(x, dist = 'norm'):
results = scipy.stats.anderson(x, dist = dist)
return {"d-value": results[0], "critical-values": results[1], "significance-value": results[2]}
def ad_ksample(samples, midrank = True):
results = scipy.stats.anderson_ksamp(samples, midrank = midrank)
return {"d-value": results[0], "critical-values": results[1], "significance-value": results[2]}
def binomial(x, n = None, p = 0.5, alternative = 'two-sided'):
results = scipy.stats.binom_test(x, n = n, p = p, alternative = alternative)
return {"p-value": results}
def fk_variance(*args, center = 'median', proportiontocut = 0.05):
results = scipy.stats.fligner(*args, center = center, proportiontocut = proportiontocut)
return {"h-value": results[0], "p-value": results[1]} # unknown if the statistic is an h value
def mood_mediantest(*args, ties = 'below', correction = True, lambda_ = 1, nan_policy = 'propagate'):
results = scipy.stats.median_test(*args, ties = ties, correction = correction, lambda_ = lambda_, nan_policy = nan_policy)
return {"chisquared-value": results[0], "p-value": results[1], "m-value": results[2], "table": results[3]}
def mood_equalscale(x, y, axis = 0):
results = scipy.stats.mood(x, y, axis = axis)
return {"z-score": results[0], "p-value": results[1]}
def skewtest(a, axis = 0, nan_policy = 'propogate'):
results = scipy.stats.skewtest(a, axis = axis, nan_policy = nan_policy)
return {"z-score": results[0], "p-value": results[1]}
def kurtosistest(a, axis = 0, nan_policy = 'propogate'):
results = scipy.stats.kurtosistest(a, axis = axis, nan_policy = nan_policy)
return {"z-score": results[0], "p-value": results[1]}
def normaltest(a, axis = 0, nan_policy = 'propogate'):
results = scipy.stats.normaltest(a, axis = axis, nan_policy = nan_policy)
return {"z-score": results[0], "p-value": results[1]}
def get_tukeyQcrit(k, df, alpha=0.05):
'''
From statsmodels.sandbox.stats.multicomp
return critical values for Tukey's HSD (Q)
Parameters
----------
k : int in {2, ..., 10}
number of tests
df : int
degrees of freedom of error term
alpha : {0.05, 0.01}
type 1 error, 1-confidence level
not enough error checking for limitations
'''
# qtable from statsmodels.sandbox.stats.multicomp
qcrit = '''
2 3 4 5 6 7 8 9 10
5 3.64 5.70 4.60 6.98 5.22 7.80 5.67 8.42 6.03 8.91 6.33 9.32 6.58 9.67 6.80 9.97 6.99 10.24
6 3.46 5.24 4.34 6.33 4.90 7.03 5.30 7.56 5.63 7.97 5.90 8.32 6.12 8.61 6.32 8.87 6.49 9.10
7 3.34 4.95 4.16 5.92 4.68 6.54 5.06 7.01 5.36 7.37 5.61 7.68 5.82 7.94 6.00 8.17 6.16 8.37
8 3.26 4.75 4.04 5.64 4.53 6.20 4.89 6.62 5.17 6.96 5.40 7.24 5.60 7.47 5.77 7.68 5.92 7.86
9 3.20 4.60 3.95 5.43 4.41 5.96 4.76 6.35 5.02 6.66 5.24 6.91 5.43 7.13 5.59 7.33 5.74 7.49
10 3.15 4.48 3.88 5.27 4.33 5.77 4.65 6.14 4.91 6.43 5.12 6.67 5.30 6.87 5.46 7.05 5.60 7.21
11 3.11 4.39 3.82 5.15 4.26 5.62 4.57 5.97 4.82 6.25 5.03 6.48 5.20 6.67 5.35 6.84 5.49 6.99
12 3.08 4.32 3.77 5.05 4.20 5.50 4.51 5.84 4.75 6.10 4.95 6.32 5.12 6.51 5.27 6.67 5.39 6.81
13 3.06 4.26 3.73 4.96 4.15 5.40 4.45 5.73 4.69 5.98 4.88 6.19 5.05 6.37 5.19 6.53 5.32 6.67
14 3.03 4.21 3.70 4.89 4.11 5.32 4.41 5.63 4.64 5.88 4.83 6.08 4.99 6.26 5.13 6.41 5.25 6.54
15 3.01 4.17 3.67 4.84 4.08 5.25 4.37 5.56 4.59 5.80 4.78 5.99 4.94 6.16 5.08 6.31 5.20 6.44
16 3.00 4.13 3.65 4.79 4.05 5.19 4.33 5.49 4.56 5.72 4.74 5.92 4.90 6.08 5.03 6.22 5.15 6.35
17 2.98 4.10 3.63 4.74 4.02 5.14 4.30 5.43 4.52 5.66 4.70 5.85 4.86 6.01 4.99 6.15 5.11 6.27
18 2.97 4.07 3.61 4.70 4.00 5.09 4.28 5.38 4.49 5.60 4.67 5.79 4.82 5.94 4.96 6.08 5.07 6.20
19 2.96 4.05 3.59 4.67 3.98 5.05 4.25 5.33 4.47 5.55 4.65 5.73 4.79 5.89 4.92 6.02 5.04 6.14
20 2.95 4.02 3.58 4.64 3.96 5.02 4.23 5.29 4.45 5.51 4.62 5.69 4.77 5.84 4.90 5.97 5.01 6.09
24 2.92 3.96 3.53 4.55 3.90 4.91 4.17 5.17 4.37 5.37 4.54 5.54 4.68 5.69 4.81 5.81 4.92 5.92
30 2.89 3.89 3.49 4.45 3.85 4.80 4.10 5.05 4.30 5.24 4.46 5.40 4.60 5.54 4.72 5.65 4.82 5.76
40 2.86 3.82 3.44 4.37 3.79 4.70 4.04 4.93 4.23 5.11 4.39 5.26 4.52 5.39 4.63 5.50 4.73 5.60
60 2.83 3.76 3.40 4.28 3.74 4.59 3.98 4.82 4.16 4.99 4.31 5.13 4.44 5.25 4.55 5.36 4.65 5.45
120 2.80 3.70 3.36 4.20 3.68 4.50 3.92 4.71 4.10 4.87 4.24 5.01 4.36 5.12 4.47 5.21 4.56 5.30
infinity 2.77 3.64 3.31 4.12 3.63 4.40 3.86 4.60 4.03 4.76 4.17 4.88 4.29 4.99 4.39 5.08 4.47 5.16
'''
res = [line.split() for line in qcrit.replace('infinity','9999').split('\n')]
c=np.array(res[2:-1]).astype(float)
#c[c==9999] = np.inf
ccols = np.arange(2,11)
crows = c[:,0]
cv005 = c[:, 1::2]
cv001 = c[:, 2::2]
if alpha == 0.05:
intp = interpolate.interp1d(crows, cv005[:,k-2])
elif alpha == 0.01:
intp = interpolate.interp1d(crows, cv001[:,k-2])
else:
raise ValueError('only implemented for alpha equal to 0.01 and 0.05')
return intp(df)
def tukey_multicomparison(groups, alpha=0.05):
#formulas according to https://astatsa.com/OneWay_Anova_with_TukeyHSD/
k = len(groups)
df = 0
means = []
MSE = 0
for group in groups:
df+= len(group)
mean = sum(group)/len(group)
means.append(mean)
MSE += sum([(i-mean)**2 for i in group])
df -= k
MSE /= df
q_dict = {}
crit_q = get_tukeyQcrit(k, df, alpha)
for i in range(k-1):
for j in range(i+1, k):
numerator = abs(means[i] - means[j])
denominator = np.sqrt( MSE / ( 2/(1/len(groups[i]) + 1/len(groups[j])) ))
q = numerator/denominator
q_dict["group "+ str(i+1) + " and group " + str(j+1)] = [q, q>crit_q]
return q_dict

View File

@ -0,0 +1,170 @@
# Only included for backwards compatibility! Do not update, StatisticalTest is preferred and supported.
import scipy
from scipy import stats
class StatisticalTest:
def ttest_onesample(self, a, popmean, axis = 0, nan_policy = 'propagate'):
results = scipy.stats.ttest_1samp(a, popmean, axis = axis, nan_policy = nan_policy)
return {"t-value": results[0], "p-value": results[1]}
def ttest_independent(self, a, b, equal = True, nan_policy = 'propagate'):
results = scipy.stats.ttest_ind(a, b, equal_var = equal, nan_policy = nan_policy)
return {"t-value": results[0], "p-value": results[1]}
def ttest_statistic(self, o1, o2, equal = True):
results = scipy.stats.ttest_ind_from_stats(o1["mean"], o1["std"], o1["nobs"], o2["mean"], o2["std"], o2["nobs"], equal_var = equal)
return {"t-value": results[0], "p-value": results[1]}
def ttest_related(self, a, b, axis = 0, nan_policy='propagate'):
results = scipy.stats.ttest_rel(a, b, axis = axis, nan_policy = nan_policy)
return {"t-value": results[0], "p-value": results[1]}
def ks_fitness(self, rvs, cdf, args = (), N = 20, alternative = 'two-sided', mode = 'approx'):
results = scipy.stats.kstest(rvs, cdf, args = args, N = N, alternative = alternative, mode = mode)
return {"ks-value": results[0], "p-value": results[1]}
def chisquare(self, f_obs, f_exp = None, ddof = None, axis = 0):
results = scipy.stats.chisquare(f_obs, f_exp = f_exp, ddof = ddof, axis = axis)
return {"chisquared-value": results[0], "p-value": results[1]}
def powerdivergence(self, f_obs, f_exp = None, ddof = None, axis = 0, lambda_ = None):
results = scipy.stats.power_divergence(f_obs, f_exp = f_exp, ddof = ddof, axis = axis, lambda_ = lambda_)
return {"powerdivergence-value": results[0], "p-value": results[1]}
def ks_twosample(self, x, y, alternative = 'two_sided', mode = 'auto'):
results = scipy.stats.ks_2samp(x, y, alternative = alternative, mode = mode)
return {"ks-value": results[0], "p-value": results[1]}
def es_twosample(self, x, y, t = (0.4, 0.8)):
results = scipy.stats.epps_singleton_2samp(x, y, t = t)
return {"es-value": results[0], "p-value": results[1]}
def mw_rank(self, x, y, use_continuity = True, alternative = None):
results = scipy.stats.mannwhitneyu(x, y, use_continuity = use_continuity, alternative = alternative)
return {"u-value": results[0], "p-value": results[1]}
def mw_tiecorrection(self, rank_values):
results = scipy.stats.tiecorrect(rank_values)
return {"correction-factor": results}
def rankdata(self, a, method = 'average'):
results = scipy.stats.rankdata(a, method = method)
return results
def wilcoxon_ranksum(self, a, b): # this seems to be superceded by Mann Whitney Wilcoxon U Test
results = scipy.stats.ranksums(a, b)
return {"u-value": results[0], "p-value": results[1]}
def wilcoxon_signedrank(self, x, y = None, zero_method = 'wilcox', correction = False, alternative = 'two-sided'):
results = scipy.stats.wilcoxon(x, y = y, zero_method = zero_method, correction = correction, alternative = alternative)
return {"t-value": results[0], "p-value": results[1]}
def kw_htest(self, *args, nan_policy = 'propagate'):
results = scipy.stats.kruskal(*args, nan_policy = nan_policy)
return {"h-value": results[0], "p-value": results[1]}
def friedman_chisquare(self, *args):
results = scipy.stats.friedmanchisquare(*args)
return {"chisquared-value": results[0], "p-value": results[1]}
def bm_wtest(self, x, y, alternative = 'two-sided', distribution = 't', nan_policy = 'propagate'):
results = scipy.stats.brunnermunzel(x, y, alternative = alternative, distribution = distribution, nan_policy = nan_policy)
return {"w-value": results[0], "p-value": results[1]}
def combine_pvalues(self, pvalues, method = 'fisher', weights = None):
results = scipy.stats.combine_pvalues(pvalues, method = method, weights = weights)
return {"combined-statistic": results[0], "p-value": results[1]}
def jb_fitness(self, x):
results = scipy.stats.jarque_bera(x)
return {"jb-value": results[0], "p-value": results[1]}
def ab_equality(self, x, y):
results = scipy.stats.ansari(x, y)
return {"ab-value": results[0], "p-value": results[1]}
def bartlett_variance(self, *args):
results = scipy.stats.bartlett(*args)
return {"t-value": results[0], "p-value": results[1]}
def levene_variance(self, *args, center = 'median', proportiontocut = 0.05):
results = scipy.stats.levene(*args, center = center, proportiontocut = proportiontocut)
return {"w-value": results[0], "p-value": results[1]}
def sw_normality(self, x):
results = scipy.stats.shapiro(x)
return {"w-value": results[0], "p-value": results[1]}
def shapiro(self, x):
return "destroyed by facts and logic"
def ad_onesample(self, x, dist = 'norm'):
results = scipy.stats.anderson(x, dist = dist)
return {"d-value": results[0], "critical-values": results[1], "significance-value": results[2]}
def ad_ksample(self, samples, midrank = True):
results = scipy.stats.anderson_ksamp(samples, midrank = midrank)
return {"d-value": results[0], "critical-values": results[1], "significance-value": results[2]}
def binomial(self, x, n = None, p = 0.5, alternative = 'two-sided'):
results = scipy.stats.binom_test(x, n = n, p = p, alternative = alternative)
return {"p-value": results}
def fk_variance(self, *args, center = 'median', proportiontocut = 0.05):
results = scipy.stats.fligner(*args, center = center, proportiontocut = proportiontocut)
return {"h-value": results[0], "p-value": results[1]} # unknown if the statistic is an h value
def mood_mediantest(self, *args, ties = 'below', correction = True, lambda_ = 1, nan_policy = 'propagate'):
results = scipy.stats.median_test(*args, ties = ties, correction = correction, lambda_ = lambda_, nan_policy = nan_policy)
return {"chisquared-value": results[0], "p-value": results[1], "m-value": results[2], "table": results[3]}
def mood_equalscale(self, x, y, axis = 0):
results = scipy.stats.mood(x, y, axis = axis)
return {"z-score": results[0], "p-value": results[1]}
def skewtest(self, a, axis = 0, nan_policy = 'propogate'):
results = scipy.stats.skewtest(a, axis = axis, nan_policy = nan_policy)
return {"z-score": results[0], "p-value": results[1]}
def kurtosistest(self, a, axis = 0, nan_policy = 'propogate'):
results = scipy.stats.kurtosistest(a, axis = axis, nan_policy = nan_policy)
return {"z-score": results[0], "p-value": results[1]}
def normaltest(self, a, axis = 0, nan_policy = 'propogate'):
results = scipy.stats.normaltest(a, axis = axis, nan_policy = nan_policy)
return {"z-score": results[0], "p-value": results[1]}

View File

@ -0,0 +1,68 @@
# Titan Robotics Team 2022: tra_analysis package
# Written by Arthur Lu, Jacob Levine, Dev Singh, and James Pan
# Notes:
# this should be imported as a python package using 'import tra_analysis'
# this should be included in the local directory or environment variable
# this module has been optimized for multhreaded computing
# current benchmark of optimization: 1.33 times faster
# setup:
__version__ = "3.0.0"
# changelog should be viewed using print(analysis.__changelog__)
__changelog__ = """changelog:
3.0.0:
- incremented version to release 3.0.0
3.0.0-rc2:
- fixed __changelog__
- fixed __all__ of Analysis, Array, ClassificationMetric, CorrelationTest, RandomForest, Sort, SVM
- populated __all__
3.0.0-alpha.4:
- changed version to 3 because of significant changes
- added backwards compatibility import of analysis
2.1.0-alpha.3:
- fixed indentation in meta data
2.1.0-alpha.2:
- updated SVM import
2.1.0-alpha.1:
- moved multiple submodules under analysis to their own modules/files
- added header, __version__, __changelog__, __author__, __all__ (unpopulated)
"""
__author__ = (
"Arthur Lu <learthurgo@gmail.com>",
"Jacob Levine <jlevine@imsa.edu>",
"Dev Singh <dev@devksingh.com>",
"James Pan <zpan@imsa.edu>"
)
__all__ = [
"Analysis",
"Array",
"ClassificationMetric",
"CorrelationTest",
"Expression",
"Fit",
"KNN",
"NaiveBayes",
"RandomForest",
"RegressionMetric",
"Sort",
"StatisticalTest",
"SVM"
]
from . import Analysis as Analysis
from . import Analysis as analysis
from .Array import Array
from .ClassificationMetric import ClassificationMetric
from . import CorrelationTest
from .equation import Expression
from . import Fit
from . import KNN
from . import NaiveBayes
from . import RandomForest
from .RegressionMetric import RegressionMetric
from . import Sort
from . import StatisticalTest
from . import SVM

File diff suppressed because it is too large Load Diff

View File

@ -1,162 +0,0 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"import re\n",
"from decimal import Decimal\n",
"from functools import reduce"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"def add(string):\n",
" while(len(re.findall(\"[+]{1}[-]?\", string)) != 0):\n",
" string = re.sub(\"[-]?\\d+[.]?\\d*[+]{1}[-]?\\d+[.]?\\d*\", str(\"%f\" % reduce((lambda x, y: x + y), [Decimal(i) for i in re.split(\"[+]{1}\", re.search(\"[-]?\\d+[.]?\\d*[+]{1}[-]?\\d+[.]?\\d*\", string).group())])), string, 1)\n",
" return string"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"def sub(string):\n",
" while(len(re.findall(\"\\d+[.]?\\d*[-]{1,2}\\d+[.]?\\d*\", string)) != 0):\n",
" g = re.search(\"\\d+[.]?\\d*[-]{1,2}\\d+[.]?\\d*\", string).group()\n",
" if(re.search(\"[-]{1,2}\", g).group() == \"-\"):\n",
" r = re.sub(\"[-]{1}\", \"+-\", g, 1)\n",
" string = re.sub(g, r, string, 1)\n",
" elif(re.search(\"[-]{1,2}\", g).group() == \"--\"):\n",
" r = re.sub(\"[-]{2}\", \"+\", g, 1)\n",
" string = re.sub(g, r, string, 1)\n",
" else:\n",
" pass\n",
" return string"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"def mul(string):\n",
" while(len(re.findall(\"[*]{1}[-]?\", string)) != 0):\n",
" string = re.sub(\"[-]?\\d+[.]?\\d*[*]{1}[-]?\\d+[.]?\\d*\", str(\"%f\" % reduce((lambda x, y: x * y), [Decimal(i) for i in re.split(\"[*]{1}\", re.search(\"[-]?\\d+[.]?\\d*[*]{1}[-]?\\d+[.]?\\d*\", string).group())])), string, 1)\n",
" return string"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"def div(string):\n",
" while(len(re.findall(\"[/]{1}[-]?\", string)) != 0):\n",
" string = re.sub(\"[-]?\\d+[.]?\\d*[/]{1}[-]?\\d+[.]?\\d*\", str(\"%f\" % reduce((lambda x, y: x / y), [Decimal(i) for i in re.split(\"[/]{1}\", re.search(\"[-]?\\d+[.]?\\d*[/]{1}[-]?\\d+[.]?\\d*\", string).group())])), string, 1)\n",
" return string"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"def exp(string):\n",
" while(len(re.findall(\"[\\^]{1}[-]?\", string)) != 0):\n",
" string = re.sub(\"[-]?\\d+[.]?\\d*[\\^]{1}[-]?\\d+[.]?\\d*\", str(\"%f\" % reduce((lambda x, y: x ** y), [Decimal(i) for i in re.split(\"[\\^]{1}\", re.search(\"[-]?\\d+[.]?\\d*[\\^]{1}[-]?\\d+[.]?\\d*\", string).group())])), string, 1)\n",
" return string"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [],
"source": [
"def evaluate(string):\n",
" string = exp(string)\n",
" string = div(string)\n",
" string = mul(string)\n",
" string = sub(string)\n",
" print(string)\n",
" string = add(string)\n",
" return string"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"output_type": "error",
"ename": "SyntaxError",
"evalue": "unexpected EOF while parsing (<ipython-input-13-f9fb4aededd9>, line 1)",
"traceback": [
"\u001b[1;36m File \u001b[1;32m\"<ipython-input-13-f9fb4aededd9>\"\u001b[1;36m, line \u001b[1;32m1\u001b[0m\n\u001b[1;33m def parentheses(string):\u001b[0m\n\u001b[1;37m ^\u001b[0m\n\u001b[1;31mSyntaxError\u001b[0m\u001b[1;31m:\u001b[0m unexpected EOF while parsing\n"
]
}
],
"source": [
"def parentheses(string):"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": "-158456325028528675187087900672.000000+0.8\n"
},
{
"output_type": "execute_result",
"data": {
"text/plain": "'-158456325028528675187087900672.000000'"
},
"metadata": {},
"execution_count": 22
}
],
"source": [
"string = \"8^32*4/-2+0.8\"\n",
"evaluate(string)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6-final"
}
},
"nbformat": 4,
"nbformat_minor": 4
}

View File

@ -0,0 +1,37 @@
# Titan Robotics Team 2022: Expression submodule
# Written by Arthur Lu
# Notes:
# this should be imported as a python module using 'from tra_analysis.Equation import Expression'
# TODO:
# - add option to pick parser backend
# - fix unit tests
# setup:
__version__ = "0.0.1-alpha"
__changelog__ = """changelog:
0.0.1-alpha:
- used the HybridExpressionParser as backend for Expression
"""
__author__ = (
"Arthur Lu <learthurgo@gmail.com>",
)
__all__ = {
"Expression"
}
import re
from .parser import BNF, RegexInplaceParser, HybridExpressionParser, Core, equation_base
class Expression(HybridExpressionParser):
expression = None
core = None
def __init__(self,expression,argorder=[],*args,**kwargs):
self.core = Core()
equation_base.equation_extend(self.core)
self.core.recalculateFMatch()
super().__init__(self.core, expression, argorder=[],*args,**kwargs)

View File

@ -0,0 +1,22 @@
# Titan Robotics Team 2022: Expression submodule
# Written by Arthur Lu
# Notes:
# this should be imported as a python module using 'from tra_analysis import Equation'
# setup:
__version__ = "0.0.1-alpha"
__changelog__ = """changelog:
0.0.1-alpha:
- made first prototype of Expression
"""
__author__ = (
"Arthur Lu <learthurgo@gmail.com>",
)
__all__ = {
"Expression"
}
from .Expression import Expression

View File

@ -0,0 +1,97 @@
from __future__ import division
from pyparsing import (Literal, CaselessLiteral, Word, Combine, Group, Optional, ZeroOrMore, Forward, nums, alphas, oneOf)
from . import py2
import math
import operator
class BNF(object):
def pushFirst(self, strg, loc, toks):
self.exprStack.append(toks[0])
def pushUMinus(self, strg, loc, toks):
if toks and toks[0] == '-':
self.exprStack.append('unary -')
def __init__(self):
"""
expop :: '^'
multop :: '*' | '/'
addop :: '+' | '-'
integer :: ['+' | '-'] '0'..'9'+
atom :: PI | E | real | fn '(' expr ')' | '(' expr ')'
factor :: atom [ expop factor ]*
term :: factor [ multop factor ]*
expr :: term [ addop term ]*
"""
point = Literal(".")
e = CaselessLiteral("E")
fnumber = Combine(Word("+-" + nums, nums) +
Optional(point + Optional(Word(nums))) +
Optional(e + Word("+-" + nums, nums)))
ident = Word(alphas, alphas + nums + "_$")
plus = Literal("+")
minus = Literal("-")
mult = Literal("*")
div = Literal("/")
lpar = Literal("(").suppress()
rpar = Literal(")").suppress()
addop = plus | minus
multop = mult | div
expop = Literal("^")
pi = CaselessLiteral("PI")
expr = Forward()
atom = ((Optional(oneOf("- +")) +
(ident + lpar + expr + rpar | pi | e | fnumber).setParseAction(self.pushFirst))
| Optional(oneOf("- +")) + Group(lpar + expr + rpar)
).setParseAction(self.pushUMinus)
factor = Forward()
factor << atom + \
ZeroOrMore((expop + factor).setParseAction(self.pushFirst))
term = factor + \
ZeroOrMore((multop + factor).setParseAction(self.pushFirst))
expr << term + \
ZeroOrMore((addop + term).setParseAction(self.pushFirst))
self.bnf = expr
epsilon = 1e-12
self.opn = {"+": operator.add,
"-": operator.sub,
"*": operator.mul,
"/": operator.truediv,
"^": operator.pow}
self.fn = {"sin": math.sin,
"cos": math.cos,
"tan": math.tan,
"exp": math.exp,
"abs": abs,
"trunc": lambda a: int(a),
"round": round,
"sgn": lambda a: abs(a) > epsilon and py2.cmp(a, 0) or 0}
def evaluateStack(self, s):
op = s.pop()
if op == 'unary -':
return -self.evaluateStack(s)
if op in "+-*/^":
op2 = self.evaluateStack(s)
op1 = self.evaluateStack(s)
return self.opn[op](op1, op2)
elif op == "PI":
return math.pi
elif op == "E":
return math.e
elif op in self.fn:
return self.fn[op](self.evaluateStack(s))
elif op[0].isalpha():
return 0
else:
return float(op)
def eval(self, num_string, parseAll=True):
self.exprStack = []
results = self.bnf.parseString(num_string, parseAll)
val = self.evaluateStack(self.exprStack[:])
return val

View File

@ -0,0 +1,521 @@
from .Hybrid_Utils import Core, ExpressionFunction, ExpressionVariable, ExpressionValue
import sys
if sys.version_info >= (3,):
xrange = range
basestring = str
class HybridExpressionParser(object):
def __init__(self,core,expression,argorder=[],*args,**kwargs):
super(HybridExpressionParser,self).__init__(*args,**kwargs)
if isinstance(expression,type(self)): # clone the object
self.core = core
self.__args = list(expression.__args)
self.__vars = dict(expression.__vars) # intenral array of preset variables
self.__argsused = set(expression.__argsused)
self.__expr = list(expression.__expr)
self.variables = {} # call variables
else:
self.__expression = expression
self.__args = argorder;
self.__vars = {} # intenral array of preset variables
self.__argsused = set()
self.__expr = [] # compiled equation tokens
self.variables = {} # call variables
self.__compile()
del self.__expression
def __getitem__(self, name):
if name in self.__argsused:
if name in self.__vars:
return self.__vars[name]
else:
return None
else:
raise KeyError(name)
def __setitem__(self,name,value):
if name in self.__argsused:
self.__vars[name] = value
else:
raise KeyError(name)
def __delitem__(self,name):
if name in self.__argsused:
if name in self.__vars:
del self.__vars[name]
else:
raise KeyError(name)
def __contains__(self, name):
return name in self.__argsused
def __call__(self,*args,**kwargs):
if len(self.__expr) == 0:
return None
self.variables = {}
self.variables.update(self.core.constants)
self.variables.update(self.__vars)
if len(args) > len(self.__args):
raise TypeError("<{0:s}.{1:s}({2:s}) object at {3:0=#10x}>() takes at most {4:d} arguments ({5:d} given)".format(
type(self).__module__,type(self).__name__,repr(self),id(self),len(self.__args),len(args)))
for i in xrange(len(args)):
if i < len(self.__args):
if self.__args[i] in kwargs:
raise TypeError("<{0:s}.{1:s}({2:s}) object at {3:0=#10x}>() got multiple values for keyword argument '{4:s}'".format(
type(self).__module__,type(self).__name__,repr(self),id(self),self.__args[i]))
self.variables[self.__args[i]] = args[i]
self.variables.update(kwargs)
for arg in self.__argsused:
if arg not in self.variables:
min_args = len(self.__argsused - (set(self.__vars.keys()) | set(self.core.constants.keys())))
raise TypeError("<{0:s}.{1:s}({2:s}) object at {3:0=#10x}>() takes at least {4:d} arguments ({5:d} given) '{6:s}' not defined".format(
type(self).__module__,type(self).__name__,repr(self),id(self),min_args,len(args)+len(kwargs),arg))
expr = self.__expr[::-1]
args = []
while len(expr) > 0:
t = expr.pop()
r = t(args,self)
args.append(r)
if len(args) > 1:
return args
else:
return args[0]
def __next(self,__expect_op):
if __expect_op:
m = self.core.gematch.match(self.__expression)
if m != None:
self.__expression = self.__expression[m.end():]
g = m.groups()
return g[0],'CLOSE'
m = self.core.smatch.match(self.__expression)
if m != None:
self.__expression = self.__expression[m.end():]
return ",",'SEP'
m = self.core.omatch.match(self.__expression)
if m != None:
self.__expression = self.__expression[m.end():]
g = m.groups()
return g[0],'OP'
else:
m = self.core.gsmatch.match(self.__expression)
if m != None:
self.__expression = self.__expression[m.end():]
g = m.groups()
return g[0],'OPEN'
m = self.core.vmatch.match(self.__expression)
if m != None:
self.__expression = self.__expression[m.end():]
g = m.groupdict(0)
if g['dec']:
if g["ivalue"]:
return complex(int(g["rsign"]+"1")*float(g["rvalue"])*10**int(g["rexpoent"]),int(g["isign"]+"1")*float(g["ivalue"])*10**int(g["iexpoent"])),'VALUE'
elif g["rexpoent"] or g["rvalue"].find('.')>=0:
return int(g["rsign"]+"1")*float(g["rvalue"])*10**int(g["rexpoent"]),'VALUE'
else:
return int(g["rsign"]+"1")*int(g["rvalue"]),'VALUE'
elif g["hex"]:
return int(g["hexsign"]+"1")*int(g["hexvalue"],16),'VALUE'
elif g["oct"]:
return int(g["octsign"]+"1")*int(g["octvalue"],8),'VALUE'
elif g["bin"]:
return int(g["binsign"]+"1")*int(g["binvalue"],2),'VALUE'
else:
raise NotImplemented("'{0:s}' Values Not Implemented Yet".format(m.string))
m = self.core.nmatch.match(self.__expression)
if m != None:
self.__expression = self.__expression[m.end():]
g = m.groups()
return g[0],'NAME'
m = self.core.fmatch.match(self.__expression)
if m != None:
self.__expression = self.__expression[m.end():]
g = m.groups()
return g[0],'FUNC'
m = self.core.umatch.match(self.__expression)
if m != None:
self.__expression = self.__expression[m.end():]
g = m.groups()
return g[0],'UNARY'
return None
def show(self):
"""Show RPN tokens
This will print out the internal token list (RPN) of the expression
one token perline.
"""
for expr in self.__expr:
print(expr)
def __str__(self):
"""str(fn)
Generates a Printable version of the Expression
Returns
-------
str
Latex String respresation of the Expression, suitable for rendering the equation
"""
expr = self.__expr[::-1]
if len(expr) == 0:
return ""
args = [];
while len(expr) > 0:
t = expr.pop()
r = t.toStr(args,self)
args.append(r)
if len(args) > 1:
return args
else:
return args[0]
def __repr__(self):
"""repr(fn)
Generates a String that correctrly respresents the equation
Returns
-------
str
Convert the Expression to a String that passed to the constructor, will constuct
an identical equation object (in terms of sequence of tokens, and token type/value)
"""
expr = self.__expr[::-1]
if len(expr) == 0:
return ""
args = [];
while len(expr) > 0:
t = expr.pop()
r = t.toRepr(args,self)
args.append(r)
if len(args) > 1:
return args
else:
return args[0]
def __iter__(self):
return iter(self.__argsused)
def __lt__(self, other):
if isinstance(other, Expression):
return repr(self) < repr(other)
else:
raise TypeError("{0:s} is not an {1:s} Object, and can't be compared to an Expression Object".format(repr(other), type(other)))
def __eq__(self, other):
if isinstance(other, Expression):
return repr(self) == repr(other)
else:
raise TypeError("{0:s} is not an {1:s} Object, and can't be compared to an Expression Object".format(repr(other), type(other)))
def __combine(self,other,op):
if op not in self.core.ops or not isinstance(other,(int,float,complex,type(self),basestring)):
return NotImplemented
else:
obj = type(self)(self)
if isinstance(other,(int,float,complex)):
obj.__expr.append(ExpressionValue(other))
else:
if isinstance(other,basestring):
try:
other = type(self)(other)
except:
raise SyntaxError("Can't Convert string, \"{0:s}\" to an Expression Object".format(other))
obj.__expr += other.__expr
obj.__argsused |= other.__argsused
for v in other.__args:
if v not in obj.__args:
obj.__args.append(v)
for k,v in other.__vars.items():
if k not in obj.__vars:
obj.__vars[k] = v
elif v != obj.__vars[k]:
raise RuntimeError("Predifined Variable Conflict in '{0:s}' two differing values defined".format(k))
fn = self.core.ops[op]
obj.__expr.append(ExpressionFunction(fn['func'],fn['args'],fn['str'],fn['latex'],op,False))
return obj
def __rcombine(self,other,op):
if op not in self.core.ops or not isinstance(other,(int,float,complex,type(self),basestring)):
return NotImplemented
else:
obj = type(self)(self)
if isinstance(other,(int,float,complex)):
obj.__expr.insert(0,ExpressionValue(other))
else:
if isinstance(other,basestring):
try:
other = type(self)(other)
except:
raise SyntaxError("Can't Convert string, \"{0:s}\" to an Expression Object".format(other))
obj.__expr = other.__expr + self.__expr
obj.__argsused = other.__argsused | self.__expr
__args = other.__args
for v in obj.__args:
if v not in __args:
__args.append(v)
obj.__args = __args
for k,v in other.__vars.items():
if k not in obj.__vars:
obj.__vars[k] = v
elif v != obj.__vars[k]:
raise RuntimeError("Predifined Variable Conflict in '{0:s}' two differing values defined".format(k))
fn = self.core.ops[op]
obj.__expr.append(ExpressionFunction(fn['func'],fn['args'],fn['str'],fn['latex'],op,False))
return obj
def __icombine(self,other,op):
if op not in self.core.ops or not isinstance(other,(int,float,complex,type(self),basestring)):
return NotImplemented
else:
obj = self
if isinstance(other,(int,float,complex)):
obj.__expr.append(ExpressionValue(other))
else:
if isinstance(other,basestring):
try:
other = type(self)(other)
except:
raise SyntaxError("Can't Convert string, \"{0:s}\" to an Expression Object".format(other))
obj.__expr += other.__expr
obj.__argsused |= other.__argsused
for v in other.__args:
if v not in obj.__args:
obj.__args.append(v)
for k,v in other.__vars.items():
if k not in obj.__vars:
obj.__vars[k] = v
elif v != obj.__vars[k]:
raise RuntimeError("Predifined Variable Conflict in '{0:s}' two differing values defined".format(k))
fn = self.core.ops[op]
obj.__expr.append(ExpressionFunction(fn['func'],fn['args'],fn['str'],fn['latex'],op,False))
return obj
def __apply(self,op):
fn = self.core.unary_ops[op]
obj = type(self)(self)
obj.__expr.append(ExpressionFunction(fn['func'],1,fn['str'],fn['latex'],op,False))
return obj
def __applycall(self,op):
fn = self.core.functions[op]
if 1 not in fn['args'] or '*' not in fn['args']:
raise RuntimeError("Can't Apply {0:s} function, dosen't accept only 1 argument".format(op))
obj = type(self)(self)
obj.__expr.append(ExpressionFunction(fn['func'],1,fn['str'],fn['latex'],op,False))
return obj
def __add__(self,other):
return self.__combine(other,'+')
def __sub__(self,other):
return self.__combine(other,'-')
def __mul__(self,other):
return self.__combine(other,'*')
def __div__(self,other):
return self.__combine(other,'/')
def __truediv__(self,other):
return self.__combine(other,'/')
def __pow__(self,other):
return self.__combine(other,'^')
def __mod__(self,other):
return self.__combine(other,'%')
def __and__(self,other):
return self.__combine(other,'&')
def __or__(self,other):
return self.__combine(other,'|')
def __xor__(self,other):
return self.__combine(other,'</>')
def __radd__(self,other):
return self.__rcombine(other,'+')
def __rsub__(self,other):
return self.__rcombine(other,'-')
def __rmul__(self,other):
return self.__rcombine(other,'*')
def __rdiv__(self,other):
return self.__rcombine(other,'/')
def __rtruediv__(self,other):
return self.__rcombine(other,'/')
def __rpow__(self,other):
return self.__rcombine(other,'^')
def __rmod__(self,other):
return self.__rcombine(other,'%')
def __rand__(self,other):
return self.__rcombine(other,'&')
def __ror__(self,other):
return self.__rcombine(other,'|')
def __rxor__(self,other):
return self.__rcombine(other,'</>')
def __iadd__(self,other):
return self.__icombine(other,'+')
def __isub__(self,other):
return self.__icombine(other,'-')
def __imul__(self,other):
return self.__icombine(other,'*')
def __idiv__(self,other):
return self.__icombine(other,'/')
def __itruediv__(self,other):
return self.__icombine(other,'/')
def __ipow__(self,other):
return self.__icombine(other,'^')
def __imod__(self,other):
return self.__icombine(other,'%')
def __iand__(self,other):
return self.__icombine(other,'&')
def __ior__(self,other):
return self.__icombine(other,'|')
def __ixor__(self,other):
return self.__icombine(other,'</>')
def __neg__(self):
return self.__apply('-')
def __invert__(self):
return self.__apply('!')
def __abs__(self):
return self.__applycall('abs')
def __getfunction(self,op):
if op[1] == 'FUNC':
fn = self.core.functions[op[0]]
fn['type'] = 'FUNC'
elif op[1] == 'UNARY':
fn = self.core.unary_ops[op[0]]
fn['type'] = 'UNARY'
fn['args'] = 1
elif op[1] == 'OP':
fn = self.core.ops[op[0]]
fn['type'] = 'OP'
return fn
def __compile(self):
self.__expr = []
stack = []
argc = []
__expect_op = False
v = self.__next(__expect_op)
while v != None:
if not __expect_op and v[1] == "OPEN":
stack.append(v)
__expect_op = False
elif __expect_op and v[1] == "CLOSE":
op = stack.pop()
while op[1] != "OPEN":
fs = self.__getfunction(op)
self.__expr.append(ExpressionFunction(fs['func'],fs['args'],fs['str'],fs['latex'],op[0],False))
op = stack.pop()
if len(stack) > 0 and stack[-1][0] in self.core.functions:
op = stack.pop()
fs = self.core.functions[op[0]]
args = argc.pop()
if fs['args'] != '+' and (args != fs['args'] and args not in fs['args']):
raise SyntaxError("Invalid number of arguments for {0:s} function".format(op[0]))
self.__expr.append(ExpressionFunction(fs['func'],args,fs['str'],fs['latex'],op[0],True))
__expect_op = True
elif __expect_op and v[0] == ",":
argc[-1] += 1
op = stack.pop()
while op[1] != "OPEN":
fs = self.__getfunction(op)
self.__expr.append(ExpressionFunction(fs['func'],fs['args'],fs['str'],fs['latex'],op[0],False))
op = stack.pop()
stack.append(op)
__expect_op = False
elif __expect_op and v[0] in self.core.ops:
fn = self.core.ops[v[0]]
if len(stack) == 0:
stack.append(v)
__expect_op = False
v = self.__next(__expect_op)
continue
op = stack.pop()
if op[0] == "(":
stack.append(op)
stack.append(v)
__expect_op = False
v = self.__next(__expect_op)
continue
fs = self.__getfunction(op)
while True:
if (fn['prec'] >= fs['prec']):
self.__expr.append(ExpressionFunction(fs['func'],fs['args'],fs['str'],fs['latex'],op[0],False))
if len(stack) == 0:
stack.append(v)
break
op = stack.pop()
if op[0] == "(":
stack.append(op)
stack.append(v)
break
fs = self.__getfunction(op)
else:
stack.append(op)
stack.append(v)
break
__expect_op = False
elif not __expect_op and v[0] in self.core.unary_ops:
fn = self.core.unary_ops[v[0]]
stack.append(v)
__expect_op = False
elif not __expect_op and v[0] in self.core.functions:
stack.append(v)
argc.append(1)
__expect_op = False
elif not __expect_op and v[1] == 'NAME':
self.__argsused.add(v[0])
if v[0] not in self.__args:
self.__args.append(v[0])
self.__expr.append(ExpressionVariable(v[0]))
__expect_op = True
elif not __expect_op and v[1] == 'VALUE':
self.__expr.append(ExpressionValue(v[0]))
__expect_op = True
else:
raise SyntaxError("Invalid Token \"{0:s}\" in Expression, Expected {1:s}".format(v,"Op" if __expect_op else "Value"))
v = self.__next(__expect_op)
if len(stack) > 0:
op = stack.pop()
while op != "(":
fs = self.__getfunction(op)
self.__expr.append(ExpressionFunction(fs['func'],fs['args'],fs['str'],fs['latex'],op[0],False))
if len(stack) > 0:
op = stack.pop()
else:
break

View File

@ -0,0 +1,237 @@
import math
import sys
import re
if sys.version_info >= (3,):
xrange = range
basestring = str
class ExpressionObject(object):
def __init__(self,*args,**kwargs):
super(ExpressionObject,self).__init__(*args,**kwargs)
def toStr(self,args,expression):
return ""
def toRepr(self,args,expression):
return ""
def __call__(self,args,expression):
pass
class ExpressionValue(ExpressionObject):
def __init__(self,value,*args,**kwargs):
super(ExpressionValue,self).__init__(*args,**kwargs)
self.value = value
def toStr(self,args,expression):
if (isinstance(self.value,complex)):
V = [self.value.real,self.value.imag]
E = [0,0]
B = [0,0]
out = ["",""]
for i in xrange(2):
if V[i] == 0:
E[i] = 0
B[i] = 0
else:
E[i] = int(math.floor(math.log10(abs(V[i]))))
B[i] = V[i]*10**-E[i]
if E[i] in [0,1,2,3] and str(V[i])[-2:] == ".0":
B[i] = int(V[i])
E[i] = 0
if E[i] in [-1,-2] and len(str(V[i])) <= 7:
B[i] = V[i]
E[i] = 0
if i == 1:
fmt = "{{0:+{0:s}}}"
else:
fmt = "{{0:-{0:s}}}"
if type(B[i]) == int:
out[i] += fmt.format('d').format(B[i])
else:
out[i] += fmt.format('.5f').format(B[i]).rstrip("0.")
if i == 1:
out[i] += "\\imath"
if E[i] != 0:
out[i] += "\\times10^{{{0:d}}}".format(E[i])
return "\\left(" + ''.join(out) + "\\right)"
elif (isinstance(self.value,float)):
V = self.value
E = 0
B = 0
out = ""
if V == 0:
E = 0
B = 0
else:
E = int(math.floor(math.log10(abs(V))))
B = V*10**-E
if E in [0,1,2,3] and str(V)[-2:] == ".0":
B = int(V)
E = 0
if E in [-1,-2] and len(str(V)) <= 7:
B = V
E = 0
if type(B) == int:
out += "{0:-d}".format(B)
else:
out += "{0:-.5f}".format(B).rstrip("0.")
if E != 0:
out += "\\times10^{{{0:d}}}".format(E)
return "\\left(" + out + "\\right)"
else:
return out
else:
return str(self.value)
def toRepr(self,args,expression):
return str(self.value)
def __call__(self,args,expression):
return self.value
def __repr__(self):
return "<{0:s}.{1:s}({2:s}) object at {3:0=#10x}>".format(type(self).__module__,type(self).__name__,str(self.value),id(self))
class ExpressionFunction(ExpressionObject):
def __init__(self,function,nargs,form,display,id,isfunc,*args,**kwargs):
super(ExpressionFunction,self).__init__(*args,**kwargs)
self.function = function
self.nargs = nargs
self.form = form
self.display = display
self.id = id
self.isfunc = isfunc
def toStr(self,args,expression):
params = []
for i in xrange(self.nargs):
params.append(args.pop())
if self.isfunc:
return str(self.display.format(','.join(params[::-1])))
else:
return str(self.display.format(*params[::-1]))
def toRepr(self,args,expression):
params = []
for i in xrange(self.nargs):
params.append(args.pop())
if self.isfunc:
return str(self.form.format(','.join(params[::-1])))
else:
return str(self.form.format(*params[::-1]))
def __call__(self,args,expression):
params = []
for i in xrange(self.nargs):
params.append(args.pop())
return self.function(*params[::-1])
def __repr__(self):
return "<{0:s}.{1:s}({2:s},{3:d}) object at {4:0=#10x}>".format(type(self).__module__,type(self).__name__,str(self.id),self.nargs,id(self))
class ExpressionVariable(ExpressionObject):
def __init__(self,name,*args,**kwargs):
super(ExpressionVariable,self).__init__(*args,**kwargs)
self.name = name
def toStr(self,args,expression):
return str(self.name)
def toRepr(self,args,expression):
return str(self.name)
def __call__(self,args,expression):
if self.name in expression.variables:
return expression.variables[self.name]
else:
return 0 # Default variables to return 0
def __repr__(self):
return "<{0:s}.{1:s}({2:s}) object at {3:0=#10x}>".format(type(self).__module__,type(self).__name__,str(self.name),id(self))
class Core():
constants = {}
unary_ops = {}
ops = {}
functions = {}
smatch = re.compile(r"\s*,")
vmatch = re.compile(r"\s*"
"(?:"
"(?P<oct>"
"(?P<octsign>[+-]?)"
r"\s*0o"
"(?P<octvalue>[0-7]+)"
")|(?P<hex>"
"(?P<hexsign>[+-]?)"
r"\s*0x"
"(?P<hexvalue>[0-9a-fA-F]+)"
")|(?P<bin>"
"(?P<binsign>[+-]?)"
r"\s*0b"
"(?P<binvalue>[01]+)"
")|(?P<dec>"
"(?P<rsign>[+-]?)"
r"\s*"
r"(?P<rvalue>(?:\d+\.\d+|\d+\.|\.\d+|\d+))"
"(?:"
"[Ee]"
r"(?P<rexpoent>[+-]?\d+)"
")?"
"(?:"
r"\s*"
r"(?P<sep>(?(rvalue)\+|))?"
r"\s*"
"(?P<isign>(?(rvalue)(?(sep)[+-]?|[+-])|[+-]?)?)"
r"\s*"
r"(?P<ivalue>(?:\d+\.\d+|\d+\.|\.\d+|\d+))"
"(?:"
"[Ee]"
r"(?P<iexpoent>[+-]?\d+)"
")?"
"[ij]"
")?"
")"
")")
nmatch = re.compile(r"\s*([a-zA-Z_][a-zA-Z0-9_]*)")
gsmatch = re.compile(r'\s*(\()')
gematch = re.compile(r'\s*(\))')
def recalculateFMatch(self):
fks = sorted(self.functions.keys(), key=len, reverse=True)
oks = sorted(self.ops.keys(), key=len, reverse=True)
uks = sorted(self.unary_ops.keys(), key=len, reverse=True)
self.fmatch = re.compile(r'\s*(' + '|'.join(map(re.escape,fks)) + ')')
self.omatch = re.compile(r'\s*(' + '|'.join(map(re.escape,oks)) + ')')
self.umatch = re.compile(r'\s*(' + '|'.join(map(re.escape,uks)) + ')')
def addFn(self,id,str,latex,args,func):
self.functions[id] = {
'str': str,
'latex': latex,
'args': args,
'func': func}
def addOp(self,id,str,latex,single,prec,func):
if single:
raise RuntimeError("Single Ops Not Yet Supported")
self.ops[id] = {
'str': str,
'latex': latex,
'args': 2,
'prec': prec,
'func': func}
def addUnaryOp(self,id,str,latex,func):
self.unary_ops[id] = {
'str': str,
'latex': latex,
'args': 1,
'prec': 0,
'func': func}
def addConst(self,name,value):
self.constants[name] = value

View File

@ -0,0 +1,2 @@
from . import equation_base as equation_base
from .ExpressionCore import ExpressionValue, ExpressionFunction, ExpressionVariable, Core

View File

@ -0,0 +1,106 @@
try:
import numpy as np
has_numpy = True
except ImportError:
import math
has_numpy = False
try:
import scipy.constants
has_scipy = True
except ImportError:
has_scipy = False
import operator as op
from .similar import sim, nsim, gsim, lsim
def equation_extend(core):
def product(*args):
if len(args) == 1 and has_numpy:
return np.prod(args[0])
else:
return reduce(op.mul,args,1)
def sumargs(*args):
if len(args) == 1:
return sum(args[0])
else:
return sum(args)
core.addOp('+',"({0:s} + {1:s})","\\left({0:s} + {1:s}\\right)",False,3,op.add)
core.addOp('-',"({0:s} - {1:s})","\\left({0:s} - {1:s}\\right)",False,3,op.sub)
core.addOp('*',"({0:s} * {1:s})","\\left({0:s} \\times {1:s}\\right)",False,2,op.mul)
core.addOp('/',"({0:s} / {1:s})","\\frac{{{0:s}}}{{{1:s}}}",False,2,op.truediv)
core.addOp('%',"({0:s} % {1:s})","\\left({0:s} \\bmod {1:s}\\right)",False,2,op.mod)
core.addOp('^',"({0:s} ^ {1:s})","{0:s}^{{{1:s}}}",False,1,op.pow)
core.addOp('**',"({0:s} ^ {1:s})","{0:s}^{{{1:s}}}",False,1,op.pow)
core.addOp('&',"({0:s} & {1:s})","\\left({0:s} \\land {1:s}\\right)",False,4,op.and_)
core.addOp('|',"({0:s} | {1:s})","\\left({0:s} \\lor {1:s}\\right)",False,4,op.or_)
core.addOp('</>',"({0:s} </> {1:s})","\\left({0:s} \\oplus {1:s}\\right)",False,4,op.xor)
core.addOp('&|',"({0:s} </> {1:s})","\\left({0:s} \\oplus {1:s}\\right)",False,4,op.xor)
core.addOp('|&',"({0:s} </> {1:s})","\\left({0:s} \\oplus {1:s}\\right)",False,4,op.xor)
core.addOp('==',"({0:s} == {1:s})","\\left({0:s} = {1:s}\\right)",False,5,op.eq)
core.addOp('=',"({0:s} == {1:s})","\\left({0:s} = {1:s}\\right)",False,5,op.eq)
core.addOp('~',"({0:s} ~ {1:s})","\\left({0:s} \\approx {1:s}\\right)",False,5,sim)
core.addOp('!~',"({0:s} !~ {1:s})","\\left({0:s} \\not\\approx {1:s}\\right)",False,5,nsim)
core.addOp('!=',"({0:s} != {1:s})","\\left({0:s} \\neg {1:s}\\right)",False,5,op.ne)
core.addOp('<>',"({0:s} != {1:s})","\\left({0:s} \\neg {1:s}\\right)",False,5,op.ne)
core.addOp('><',"({0:s} != {1:s})","\\left({0:s} \\neg {1:s}\\right)",False,5,op.ne)
core.addOp('<',"({0:s} < {1:s})","\\left({0:s} < {1:s}\\right)",False,5,op.lt)
core.addOp('>',"({0:s} > {1:s})","\\left({0:s} > {1:s}\\right)",False,5,op.gt)
core.addOp('<=',"({0:s} <= {1:s})","\\left({0:s} \\leq {1:s}\\right)",False,5,op.le)
core.addOp('>=',"({0:s} >= {1:s})","\\left({0:s} \\geq {1:s}\\right)",False,5,op.ge)
core.addOp('=<',"({0:s} <= {1:s})","\\left({0:s} \\leq {1:s}\\right)",False,5,op.le)
core.addOp('=>',"({0:s} >= {1:s})","\\left({0:s} \\geq {1:s}\\right)",False,5,op.ge)
core.addOp('<~',"({0:s} <~ {1:s})","\\left({0:s} \lessapprox {1:s}\\right)",False,5,lsim)
core.addOp('>~',"({0:s} >~ {1:s})","\\left({0:s} \\gtrapprox {1:s}\\right)",False,5,gsim)
core.addOp('~<',"({0:s} <~ {1:s})","\\left({0:s} \lessapprox {1:s}\\right)",False,5,lsim)
core.addOp('~>',"({0:s} >~ {1:s})","\\left({0:s} \\gtrapprox {1:s}\\right)",False,5,gsim)
core.addUnaryOp('!',"(!{0:s})","\\neg{0:s}",op.not_)
core.addUnaryOp('-',"-{0:s}","-{0:s}",op.neg)
core.addFn('abs',"abs({0:s})","\\left|{0:s}\\right|",1,op.abs)
core.addFn('sum',"sum({0:s})","\\sum\\left({0:s}\\right)",'+',sumargs)
core.addFn('prod',"prod({0:s})","\\prod\\left({0:s}\\right)",'+',product)
if has_numpy:
core.addFn('floor',"floor({0:s})","\\lfloor {0:s} \\rfloor",1,np.floor)
core.addFn('ceil',"ceil({0:s})","\\lceil {0:s} \\rceil",1,np.ceil)
core.addFn('round',"round({0:s})","\\lfloor {0:s} \\rceil",1,np.round)
core.addFn('sin',"sin({0:s})","\\sin\\left({0:s}\\right)",1,np.sin)
core.addFn('cos',"cos({0:s})","\\cos\\left({0:s}\\right)",1,np.cos)
core.addFn('tan',"tan({0:s})","\\tan\\left({0:s}\\right)",1,np.tan)
core.addFn('re',"re({0:s})","\\Re\\left({0:s}\\right)",1,np.real)
core.addFn('im',"re({0:s})","\\Im\\left({0:s}\\right)",1,np.imag)
core.addFn('sqrt',"sqrt({0:s})","\\sqrt{{{0:s}}}",1,np.sqrt)
core.addConst("pi",np.pi)
core.addConst("e",np.e)
core.addConst("Inf",np.Inf)
core.addConst("NaN",np.NaN)
else:
core.addFn('floor',"floor({0:s})","\\lfloor {0:s} \\rfloor",1,math.floor)
core.addFn('ceil',"ceil({0:s})","\\lceil {0:s} \\rceil",1,math.ceil)
core.addFn('round',"round({0:s})","\\lfloor {0:s} \\rceil",1,round)
core.addFn('sin',"sin({0:s})","\\sin\\left({0:s}\\right)",1,math.sin)
core.addFn('cos',"cos({0:s})","\\cos\\left({0:s}\\right)",1,math.cos)
core.addFn('tan',"tan({0:s})","\\tan\\left({0:s}\\right)",1,math.tan)
core.addFn('re',"re({0:s})","\\Re\\left({0:s}\\right)",1,complex.real)
core.addFn('im',"re({0:s})","\\Im\\left({0:s}\\right)",1,complex.imag)
core.addFn('sqrt',"sqrt({0:s})","\\sqrt{{{0:s}}}",1,math.sqrt)
core.addConst("pi",math.pi)
core.addConst("e",math.e)
core.addConst("Inf",float("Inf"))
core.addConst("NaN",float("NaN"))
if has_scipy:
core.addConst("h",scipy.constants.h)
core.addConst("hbar",scipy.constants.hbar)
core.addConst("m_e",scipy.constants.m_e)
core.addConst("m_p",scipy.constants.m_p)
core.addConst("m_n",scipy.constants.m_n)
core.addConst("c",scipy.constants.c)
core.addConst("N_A",scipy.constants.N_A)
core.addConst("mu_0",scipy.constants.mu_0)
core.addConst("eps_0",scipy.constants.epsilon_0)
core.addConst("k",scipy.constants.k)
core.addConst("G",scipy.constants.G)
core.addConst("g",scipy.constants.g)
core.addConst("q",scipy.constants.e)
core.addConst("R",scipy.constants.R)
core.addConst("sigma",scipy.constants.e)
core.addConst("Rb",scipy.constants.Rydberg)

View File

@ -0,0 +1,49 @@
_tol = 1e-5
def sim(a,b):
if (a==b):
return True
elif a == 0 or b == 0:
return False
if (a<b):
return (1-a/b)<=_tol
else:
return (1-b/a)<=_tol
def nsim(a,b):
if (a==b):
return False
elif a == 0 or b == 0:
return True
if (a<b):
return (1-a/b)>_tol
else:
return (1-b/a)>_tol
def gsim(a,b):
if a >= b:
return True
return (1-a/b)<=_tol
def lsim(a,b):
if a <= b:
return True
return (1-b/a)<=_tol
def set_tol(value=1e-5):
r"""Set Error Tolerance
Set the tolerance for detriming if two numbers are simliar, i.e
:math:`\left|\frac{a}{b}\right| = 1 \pm tolerance`
Parameters
----------
value: float
The Value to set the tolerance to show be very small as it respresents the
percentage of acceptable error in detriming if two values are the same.
"""
global _tol
if isinstance(value,float):
_tol = value
else:
raise TypeError(type(value))

View File

@ -0,0 +1,51 @@
import re
from decimal import Decimal
from functools import reduce
class RegexInplaceParser(object):
def __init__(self, string):
self.string = string
def add(self, string):
while(len(re.findall("[+]{1}[-]?", string)) != 0):
string = re.sub("[-]?\d+[.]?\d*[+]{1}[-]?\d+[.]?\d*", str("%f" % reduce((lambda x, y: x + y), [Decimal(i) for i in re.split("[+]{1}", re.search("[-]?\d+[.]?\d*[+]{1}[-]?\d+[.]?\d*", string).group())])), string, 1)
return string
def sub(self, string):
while(len(re.findall("\d+[.]?\d*[-]{1,2}\d+[.]?\d*", string)) != 0):
g = re.search("\d+[.]?\d*[-]{1,2}\d+[.]?\d*", string).group()
if(re.search("[-]{1,2}", g).group() == "-"):
r = re.sub("[-]{1}", "+-", g, 1)
string = re.sub(g, r, string, 1)
elif(re.search("[-]{1,2}", g).group() == "--"):
r = re.sub("[-]{2}", "+", g, 1)
string = re.sub(g, r, string, 1)
else:
pass
return string
def mul(self, string):
while(len(re.findall("[*]{1}[-]?", string)) != 0):
string = re.sub("[-]?\d+[.]?\d*[*]{1}[-]?\d+[.]?\d*", str("%f" % reduce((lambda x, y: x * y), [Decimal(i) for i in re.split("[*]{1}", re.search("[-]?\d+[.]?\d*[*]{1}[-]?\d+[.]?\d*", string).group())])), string, 1)
return string
def div(self, string):
while(len(re.findall("[/]{1}[-]?", string)) != 0):
string = re.sub("[-]?\d+[.]?\d*[/]{1}[-]?\d+[.]?\d*", str("%f" % reduce((lambda x, y: x / y), [Decimal(i) for i in re.split("[/]{1}", re.search("[-]?\d+[.]?\d*[/]{1}[-]?\d+[.]?\d*", string).group())])), string, 1)
return string
def exp(self, string):
while(len(re.findall("[\^]{1}[-]?", string)) != 0):
string = re.sub("[-]?\d+[.]?\d*[\^]{1}[-]?\d+[.]?\d*", str("%f" % reduce((lambda x, y: x ** y), [Decimal(i) for i in re.split("[\^]{1}", re.search("[-]?\d+[.]?\d*[\^]{1}[-]?\d+[.]?\d*", string).group())])), string, 1)
return string
def evaluate(self):
string = self.string
string = self.exp(string)
string = self.div(string)
string = self.mul(string)
string = self.sub(string)
string = self.add(string)
return string

View File

@ -0,0 +1,34 @@
# Titan Robotics Team 2022: Expression submodule
# Written by Arthur Lu
# Notes:
# this should be imported as a python module using 'from tra_analysis.Equation import parser'
# setup:
__version__ = "0.0.4-alpha"
__changelog__ = """changelog:
0.0.4-alpha:
- moved individual parsers to their own files
0.0.3-alpha:
- readded old regex based parser as RegexInplaceParser
0.0.2-alpha:
- wrote BNF using pyparsing and uses a BNF metasyntax
- renamed this submodule parser
0.0.1-alpha:
- took items from equation.ipynb and ported here
"""
__author__ = (
"Arthur Lu <learthurgo@gmail.com>",
)
__all__ = {
"BNF",
"RegexInplaceParser",
"HybridExpressionParser"
}
from .BNF import BNF as BNF
from .RegexInplaceParser import RegexInplaceParser as RegexInplaceParser
from .Hybrid import HybridExpressionParser
from .Hybrid_Utils import equation_base, Core

View File

@ -0,0 +1,21 @@
# Titan Robotics Team 2022: py2 module
# Written by Arthur Lu
# Notes:
# this module should only be used internally, contains old python 2.X functions that have been removed.
# setup:
from __future__ import division
__version__ = "1.0.0"
__changelog__ = """changelog:
1.0.0:
- added cmp function
"""
__author__ = (
"Arthur Lu <learthurgo@gmail.com>",
)
def cmp(a, b):
return (a > b) - (a < b)

View File

@ -1,46 +0,0 @@
{
"max-threads": 0.5,
"team": "",
"competition": "",
"key":{
"database":"",
"tba":""
},
"statistics":{
"match":{
"balls-blocked":["basic_stats","historical_analysis","regression_linear","regression_logarithmic","regression_exponential","regression_polynomial","regression_sigmoidal"],
"balls-collected":["basic_stats","historical_analysis","regression_linear","regression_logarithmic","regression_exponential","regression_polynomial","regression_sigmoidal"],
"balls-lower-teleop":["basic_stats","historical_analysis","regression_linear","regression_logarithmic","regression_exponential","regression_polynomial","regression_sigmoidal"],
"balls-lower-auto":["basic_stats","historical_analysis","regression_linear","regression_logarithmic","regression_exponential","regression_polynomial","regression_sigmoidal"],
"balls-started":["basic_stats","historical_analyss","regression_linear","regression_logarithmic","regression_exponential","regression_polynomial","regression_sigmoidal"],
"balls-upper-teleop":["basic_stats","historical_analysis","regression_linear","regression_logarithmic","regression_exponential","regression_polynomial","regression_sigmoidal"],
"balls-upper-auto":["basic_stats","historical_analysis","regression_linear","regression_logarithmic","regression_exponential","regression_polynomial","regression_sigmoidal"]
},
"metric":{
"elo":{
"score":1500,
"N":400,
"K":24
},
"gl2":{
"score":1500,
"rd":250,
"vol":0.06
},
"ts":{
"mu":25,
"sigma":8.33
}
},
"pit":{
"wheel-mechanism":true,
"low-balls":true,
"high-balls":true,
"wheel-success":true,
"strategic-focus":true,
"climb-mechanism":true,
"attitude":true
}
}
}

View File

@ -1,129 +0,0 @@
import requests
import pymongo
import pandas as pd
import time
def pull_new_tba_matches(apikey, competition, cutoff):
api_key= apikey
x=requests.get("https://www.thebluealliance.com/api/v3/event/"+competition+"/matches/simple", headers={"X-TBA-Auth_Key":api_key})
out = []
for i in x.json():
if i["actual_time"] != None and i["actual_time"]-cutoff >= 0 and i["comp_level"] == "qm":
out.append({"match" : i['match_number'], "blue" : list(map(lambda x: int(x[3:]), i['alliances']['blue']['team_keys'])), "red" : list(map(lambda x: int(x[3:]), i['alliances']['red']['team_keys'])), "winner": i["winning_alliance"]})
return out
def get_team_match_data(apikey, competition, team_num):
client = pymongo.MongoClient(apikey)
db = client.data_scouting
mdata = db.matchdata
out = {}
for i in mdata.find({"competition" : competition, "team_scouted": team_num}):
out[i['match']] = i['data']
return pd.DataFrame(out)
def get_team_pit_data(apikey, competition, team_num):
client = pymongo.MongoClient(apikey)
db = client.data_scouting
mdata = db.pitdata
out = {}
return mdata.find_one({"competition" : competition, "team_scouted": team_num})["data"]
def get_team_metrics_data(apikey, competition, team_num):
client = pymongo.MongoClient(apikey)
db = client.data_processing
mdata = db.team_metrics
return mdata.find_one({"competition" : competition, "team": team_num})
def get_match_data_formatted(apikey, competition):
client = pymongo.MongoClient(apikey)
db = client.data_scouting
mdata = db.teamlist
x=mdata.find_one({"competition":competition})
out = {}
for i in x:
try:
out[int(i)] = unkeyify_2l(get_team_match_data(apikey, competition, int(i)).transpose().to_dict())
except:
pass
return out
def get_metrics_data_formatted(apikey, competition):
client = pymongo.MongoClient(apikey)
db = client.data_scouting
mdata = db.teamlist
x=mdata.find_one({"competition":competition})
out = {}
for i in x:
try:
out[int(i)] = d.get_team_metrics_data(apikey, competition, int(i))
except:
pass
return out
def get_pit_data_formatted(apikey, competition):
client = pymongo.MongoClient(apikey)
db = client.data_scouting
mdata = db.teamlist
x=mdata.find_one({"competition":competition})
out = {}
for i in x:
try:
out[int(i)] = get_team_pit_data(apikey, competition, int(i))
except:
pass
return out
def get_pit_variable_data(apikey, competition):
client = pymongo.MongoClient(apikey)
db = client.data_processing
mdata = db.team_pit
out = {}
return mdata.find()
def get_pit_variable_formatted(apikey, competition):
temp = get_pit_variable_data(apikey, competition)
out = {}
for i in temp:
out[i["variable"]] = i["data"]
return out
def push_team_tests_data(apikey, competition, team_num, data, dbname = "data_processing", colname = "team_tests"):
client = pymongo.MongoClient(apikey)
db = client[dbname]
mdata = db[colname]
mdata.replace_one({"competition" : competition, "team": team_num}, {"_id": competition+str(team_num)+"am", "competition" : competition, "team" : team_num, "data" : data}, True)
def push_team_metrics_data(apikey, competition, team_num, data, dbname = "data_processing", colname = "team_metrics"):
client = pymongo.MongoClient(apikey)
db = client[dbname]
mdata = db[colname]
mdata.replace_one({"competition" : competition, "team": team_num}, {"_id": competition+str(team_num)+"am", "competition" : competition, "team" : team_num, "metrics" : data}, True)
def push_team_pit_data(apikey, competition, variable, data, dbname = "data_processing", colname = "team_pit"):
client = pymongo.MongoClient(apikey)
db = client[dbname]
mdata = db[colname]
mdata.replace_one({"competition" : competition, "variable": variable}, {"competition" : competition, "variable" : variable, "data" : data}, True)
def get_analysis_flags(apikey, flag):
client = pymongo.MongoClient(apikey)
db = client.data_processing
mdata = db.flags
return mdata.find_one({flag:{"$exists":True}})
def set_analysis_flags(apikey, flag, data):
client = pymongo.MongoClient(apikey)
db = client.data_processing
mdata = db.flags
return mdata.replace_one({flag:{"$exists":True}}, data, True)
def unkeyify_2l(layered_dict):
out = {}
for i in layered_dict.keys():
add = []
sortkey = []
for j in layered_dict[i].keys():
add.append([j,layered_dict[i][j]])
add.sort(key = lambda x: x[0])
out[i] = list(map(lambda x: x[1], add))
return out

View File

@ -1,4 +0,0 @@
requests
pymongo
pandas
tra-analysis

View File

@ -1,536 +0,0 @@
# Titan Robotics Team 2022: Superscript Script
# Written by Arthur Lu, Jacob Levine, and Dev Singh
# Notes:
# setup:
__version__ = "0.8.2"
# changelog should be viewed using print(analysis.__changelog__)
__changelog__ = """changelog:
0.8.2:
- readded while true to main function
- added more thread config options
0.8.1:
- optimized matchloop further by bypassing GIL
0.8.0:
- added multithreading to matchloop
- tweaked user log
0.7.0:
- finished implementing main function
0.6.2:
- integrated get_team_rankings.py as get_team_metrics() function
- integrated visualize_pit.py as graph_pit_histogram() function
0.6.1:
- bug fixes with analysis.Metric() calls
- modified metric functions to use config.json defined default values
0.6.0:
- removed main function
- changed load_config function
- added save_config function
- added load_match function
- renamed simpleloop to matchloop
- moved simplestats function inside matchloop
- renamed load_metrics to load_metric
- renamed metricsloop to metricloop
- split push to database functions amon push_match, push_metric, push_pit
- moved
0.5.2:
- made changes due to refactoring of analysis
0.5.1:
- text fixes
- removed matplotlib requirement
0.5.0:
- improved user interface
0.4.2:
- removed unessasary code
0.4.1:
- fixed bug where X range for regression was determined before sanitization
- better sanitized data
0.4.0:
- fixed spelling issue in __changelog__
- addressed nan bug in regression
- fixed errors on line 335 with metrics calling incorrect key "glicko2"
- fixed errors in metrics computing
0.3.0:
- added analysis to pit data
0.2.1:
- minor stability patches
- implemented db syncing for timestamps
- fixed bugs
0.2.0:
- finalized testing and small fixes
0.1.4:
- finished metrics implement, trueskill is bugged
0.1.3:
- working
0.1.2:
- started implement of metrics
0.1.1:
- cleaned up imports
0.1.0:
- tested working, can push to database
0.0.9:
- tested working
- prints out stats for the time being, will push to database later
0.0.8:
- added data import
- removed tba import
- finished main method
0.0.7:
- added load_config
- optimized simpleloop for readibility
- added __all__ entries
- added simplestats engine
- pending testing
0.0.6:
- fixes
0.0.5:
- imported pickle
- created custom database object
0.0.4:
- fixed simpleloop to actually return a vector
0.0.3:
- added metricsloop which is unfinished
0.0.2:
- added simpleloop which is untested until data is provided
0.0.1:
- created script
- added analysis, numba, numpy imports
"""
__author__ = (
"Arthur Lu <learthurgo@gmail.com>",
"Jacob Levine <jlevine@imsa.edu>",
)
__all__ = [
"load_config",
"save_config",
"get_previous_time",
"load_match",
"matchloop",
"load_metric",
"metricloop",
"load_pit",
"pitloop",
"push_match",
"push_metric",
"push_pit",
]
# imports:
from tra_analysis import analysis as an
import data as d
from collections import defaultdict
import json
import math
import numpy as np
import os
from os import system, name
from pathlib import Path
from multiprocessing import Pool
import matplotlib.pyplot as plt
from concurrent.futures import ThreadPoolExecutor
import time
import warnings
global exec_threads
def main():
global exec_threads
warnings.filterwarnings("ignore")
while (True):
current_time = time.time()
print("[OK] time: " + str(current_time))
config = load_config("config.json")
competition = config["competition"]
match_tests = config["statistics"]["match"]
pit_tests = config["statistics"]["pit"]
metrics_tests = config["statistics"]["metric"]
print("[OK] configs loaded")
print("[OK] starting threads")
cfg_max_threads = config["max-threads"]
sys_max_threads = os.cpu_count()
if cfg_max_threads > -sys_max_threads and cfg_max_threads < 0 :
alloc_processes = sys_max_threads + cfg_max_threads
elif cfg_max_threads > 0 and cfg_max_threads < 1:
alloc_processes = math.floor(cfg_max_threads * sys_max_threads)
elif cfg_max_threads > 1 and cfg_max_threads <= sys_max_threads:
alloc_processes = cfg_max_threads
elif cfg_max_threads == 0:
alloc_processes = sys_max_threads
else:
print("[Err] Invalid number of processes, must be between -" + str(sys_max_threads) + " and " + str(sys_max_threads))
exit()
exec_threads = Pool(processes = alloc_processes)
print("[OK] " + str(alloc_processes) + " threads started")
apikey = config["key"]["database"]
tbakey = config["key"]["tba"]
print("[OK] loaded keys")
previous_time = get_previous_time(apikey)
print("[OK] analysis backtimed to: " + str(previous_time))
print("[OK] loading data")
start = time.time()
match_data = load_match(apikey, competition)
pit_data = load_pit(apikey, competition)
print("[OK] loaded data in " + str(time.time() - start) + " seconds")
print("[OK] running match stats")
start = time.time()
matchloop(apikey, competition, match_data, match_tests)
print("[OK] finished match stats in " + str(time.time() - start) + " seconds")
print("[OK] running team metrics")
start = time.time()
metricloop(tbakey, apikey, competition, previous_time, metrics_tests)
print("[OK] finished team metrics in " + str(time.time() - start) + " seconds")
print("[OK] running pit analysis")
start = time.time()
pitloop(apikey, competition, pit_data, pit_tests)
print("[OK] finished pit analysis in " + str(time.time() - start) + " seconds")
set_current_time(apikey, current_time)
print("[OK] finished all tests, looping")
clear()
def clear():
# for windows
if name == 'nt':
_ = system('cls')
# for mac and linux(here, os.name is 'posix')
else:
_ = system('clear')
def load_config(file):
config_vector = {}
with open(file) as f:
config_vector = json.load(f)
return config_vector
def save_config(file, config_vector):
with open(file) as f:
json.dump(config_vector, f)
def get_previous_time(apikey):
previous_time = d.get_analysis_flags(apikey, "latest_update")
if previous_time == None:
d.set_analysis_flags(apikey, "latest_update", 0)
previous_time = 0
else:
previous_time = previous_time["latest_update"]
return previous_time
def set_current_time(apikey, current_time):
d.set_analysis_flags(apikey, "latest_update", {"latest_update":current_time})
def load_match(apikey, competition):
return d.get_match_data_formatted(apikey, competition)
def simplestats(data_test):
data = np.array(data_test[0])
data = data[np.isfinite(data)]
ranges = list(range(len(data)))
test = data_test[1]
if test == "basic_stats":
return an.basic_stats(data)
if test == "historical_analysis":
return an.histo_analysis([ranges, data])
if test == "regression_linear":
return an.regression(ranges, data, ['lin'])
if test == "regression_logarithmic":
return an.regression(ranges, data, ['log'])
if test == "regression_exponential":
return an.regression(ranges, data, ['exp'])
if test == "regression_polynomial":
return an.regression(ranges, data, ['ply'])
if test == "regression_sigmoidal":
return an.regression(ranges, data, ['sig'])
def matchloop(apikey, competition, data, tests): # expects 3D array with [Team][Variable][Match]
global exec_threads
class AutoVivification(dict):
def __getitem__(self, item):
try:
return dict.__getitem__(self, item)
except KeyError:
value = self[item] = type(self)()
return value
return_vector = {}
team_filtered = []
variable_filtered = []
variable_data = []
test_filtered = []
result_filtered = []
return_vector = AutoVivification()
for team in data:
for variable in data[team]:
if variable in tests:
for test in tests[variable]:
team_filtered.append(team)
variable_filtered.append(variable)
variable_data.append((data[team][variable], test))
test_filtered.append(test)
result_filtered = exec_threads.map(simplestats, variable_data)
i = 0
result_filtered = list(result_filtered)
for result in result_filtered:
return_vector[team_filtered[i]][variable_filtered[i]][test_filtered[i]] = result
i += 1
push_match(apikey, competition, return_vector)
def load_metric(apikey, competition, match, group_name, metrics):
group = {}
for team in match[group_name]:
db_data = d.get_team_metrics_data(apikey, competition, team)
if d.get_team_metrics_data(apikey, competition, team) == None:
elo = {"score": metrics["elo"]["score"]}
gl2 = {"score": metrics["gl2"]["score"], "rd": metrics["gl2"]["rd"], "vol": metrics["gl2"]["vol"]}
ts = {"mu": metrics["ts"]["mu"], "sigma": metrics["ts"]["sigma"]}
group[team] = {"elo": elo, "gl2": gl2, "ts": ts}
else:
metrics = db_data["metrics"]
elo = metrics["elo"]
gl2 = metrics["gl2"]
ts = metrics["ts"]
group[team] = {"elo": elo, "gl2": gl2, "ts": ts}
return group
def metricloop(tbakey, apikey, competition, timestamp, metrics): # listener based metrics update
elo_N = metrics["elo"]["N"]
elo_K = metrics["elo"]["K"]
matches = d.pull_new_tba_matches(tbakey, competition, timestamp)
red = {}
blu = {}
for match in matches:
red = load_metric(apikey, competition, match, "red", metrics)
blu = load_metric(apikey, competition, match, "blue", metrics)
elo_red_total = 0
elo_blu_total = 0
gl2_red_score_total = 0
gl2_blu_score_total = 0
gl2_red_rd_total = 0
gl2_blu_rd_total = 0
gl2_red_vol_total = 0
gl2_blu_vol_total = 0
for team in red:
elo_red_total += red[team]["elo"]["score"]
gl2_red_score_total += red[team]["gl2"]["score"]
gl2_red_rd_total += red[team]["gl2"]["rd"]
gl2_red_vol_total += red[team]["gl2"]["vol"]
for team in blu:
elo_blu_total += blu[team]["elo"]["score"]
gl2_blu_score_total += blu[team]["gl2"]["score"]
gl2_blu_rd_total += blu[team]["gl2"]["rd"]
gl2_blu_vol_total += blu[team]["gl2"]["vol"]
red_elo = {"score": elo_red_total / len(red)}
blu_elo = {"score": elo_blu_total / len(blu)}
red_gl2 = {"score": gl2_red_score_total / len(red), "rd": gl2_red_rd_total / len(red), "vol": gl2_red_vol_total / len(red)}
blu_gl2 = {"score": gl2_blu_score_total / len(blu), "rd": gl2_blu_rd_total / len(blu), "vol": gl2_blu_vol_total / len(blu)}
if match["winner"] == "red":
observations = {"red": 1, "blu": 0}
elif match["winner"] == "blue":
observations = {"red": 0, "blu": 1}
else:
observations = {"red": 0.5, "blu": 0.5}
red_elo_delta = an.Metric().elo(red_elo["score"], blu_elo["score"], observations["red"], elo_N, elo_K) - red_elo["score"]
blu_elo_delta = an.Metric().elo(blu_elo["score"], red_elo["score"], observations["blu"], elo_N, elo_K) - blu_elo["score"]
new_red_gl2_score, new_red_gl2_rd, new_red_gl2_vol = an.Metric().glicko2(red_gl2["score"], red_gl2["rd"], red_gl2["vol"], [blu_gl2["score"]], [blu_gl2["rd"]], [observations["red"], observations["blu"]])
new_blu_gl2_score, new_blu_gl2_rd, new_blu_gl2_vol = an.Metric().glicko2(blu_gl2["score"], blu_gl2["rd"], blu_gl2["vol"], [red_gl2["score"]], [red_gl2["rd"]], [observations["blu"], observations["red"]])
red_gl2_delta = {"score": new_red_gl2_score - red_gl2["score"], "rd": new_red_gl2_rd - red_gl2["rd"], "vol": new_red_gl2_vol - red_gl2["vol"]}
blu_gl2_delta = {"score": new_blu_gl2_score - blu_gl2["score"], "rd": new_blu_gl2_rd - blu_gl2["rd"], "vol": new_blu_gl2_vol - blu_gl2["vol"]}
for team in red:
red[team]["elo"]["score"] = red[team]["elo"]["score"] + red_elo_delta
red[team]["gl2"]["score"] = red[team]["gl2"]["score"] + red_gl2_delta["score"]
red[team]["gl2"]["rd"] = red[team]["gl2"]["rd"] + red_gl2_delta["rd"]
red[team]["gl2"]["vol"] = red[team]["gl2"]["vol"] + red_gl2_delta["vol"]
for team in blu:
blu[team]["elo"]["score"] = blu[team]["elo"]["score"] + blu_elo_delta
blu[team]["gl2"]["score"] = blu[team]["gl2"]["score"] + blu_gl2_delta["score"]
blu[team]["gl2"]["rd"] = blu[team]["gl2"]["rd"] + blu_gl2_delta["rd"]
blu[team]["gl2"]["vol"] = blu[team]["gl2"]["vol"] + blu_gl2_delta["vol"]
temp_vector = {}
temp_vector.update(red)
temp_vector.update(blu)
push_metric(apikey, competition, temp_vector)
def load_pit(apikey, competition):
return d.get_pit_data_formatted(apikey, competition)
def pitloop(apikey, competition, pit, tests):
return_vector = {}
for team in pit:
for variable in pit[team]:
if variable in tests:
if not variable in return_vector:
return_vector[variable] = []
return_vector[variable].append(pit[team][variable])
push_pit(apikey, competition, return_vector)
def push_match(apikey, competition, results):
for team in results:
d.push_team_tests_data(apikey, competition, team, results[team])
def push_metric(apikey, competition, metric):
for team in metric:
d.push_team_metrics_data(apikey, competition, team, metric[team])
def push_pit(apikey, competition, pit):
for variable in pit:
d.push_team_pit_data(apikey, competition, variable, pit[variable])
def get_team_metrics(apikey, tbakey, competition):
metrics = d.get_metrics_data_formatted(apikey, competition)
elo = {}
gl2 = {}
for team in metrics:
elo[team] = metrics[team]["metrics"]["elo"]["score"]
gl2[team] = metrics[team]["metrics"]["gl2"]["score"]
elo = {k: v for k, v in sorted(elo.items(), key=lambda item: item[1])}
gl2 = {k: v for k, v in sorted(gl2.items(), key=lambda item: item[1])}
elo_ranked = []
for team in elo:
elo_ranked.append({"team": str(team), "elo": str(elo[team])})
gl2_ranked = []
for team in gl2:
gl2_ranked.append({"team": str(team), "gl2": str(gl2[team])})
return {"elo-ranks": elo_ranked, "glicko2-ranks": gl2_ranked}
def graph_pit_histogram(apikey, competition, figsize=(80,15)):
pit = d.get_pit_variable_formatted(apikey, competition)
fig, ax = plt.subplots(1, len(pit), sharey=True, figsize=figsize)
i = 0
for variable in pit:
ax[i].hist(pit[variable])
ax[i].invert_xaxis()
ax[i].set_xlabel('')
ax[i].set_ylabel('Frequency')
ax[i].set_title(variable)
plt.yticks(np.arange(len(pit[variable])))
i+=1
plt.show()
main()

View File

@ -1,2 +0,0 @@
def test_():
assert 1 == 1