Merge branch 'master-staged' of https://github.com/titanscouting/red-alliance-analysis into master-staged

This commit is contained in:
Arthur Lu 2020-10-06 19:17:35 -07:00
commit ab3585437c
9 changed files with 281 additions and 54 deletions

View File

@ -24,5 +24,5 @@
"ms-python.python",
"waderyan.gitblame"
],
"postCreateCommand": "apt install vim -y ; pip install -r data-analysis/requirements.txt ; pip install -r analysis-master/requirements.txt ; pip install tra-analysis"
"postCreateCommand": "apt install vim -y ; pip install -r data-analysis/requirements.txt ; pip install -r analysis-master/requirements.txt ; pip install pylint ; pip install tra-analysis"
}

View File

@ -1,37 +1,102 @@
# Red Alliance Analysis · ![GitHub release (latest by date)](https://img.shields.io/github/v/release/titanscout2022/red-alliance-analysis)
Titan Robotics 2022 Strategy Team Repository for Data Analysis Tools. Included with these tools are the backend data analysis engine formatted as a python package, associated binaries for the analysis package, and premade scripts that can be pulled directly from this repository and will integrate with other Red Alliance applications to quickly deploy FRC scouting tools.
# Getting Started
---
# `tra-analysis`
`tra-analysis` is a higher level package for data processing and analysis. It is a python library that combines popular data science tools like numpy, scipy, and sklearn along with other tools to create an easy-to-use data analysis engine. tra-analysis includes analysis in all ranges of complexity from basic statistics like mean, median, mode to complex kernel based classifiers and allows user to more quickly deploy these algorithms. The package also includes performance metrics for score based applications including elo, glicko2, and trueskill ranking systems.
At the core of the tra-analysis package is the modularity of each analytical tool. The package encapsulates the setup code for the included data science tools. For example, there are many packages that allow users to generate many different types of regressions. With the tra-analysis package, one function can be called to generate many regressions and sort them by accuracy.
## Prerequisites
---
* Python >= 3.6
* Pip which can be installed by running\
`curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py`\
`python get-pip.py`\
after installing python, or with a package manager on linux. Refer to the [pip installation instructions](https://pip.pypa.io/en/stable/installing/) for more information.
## Installing
### Standard Platforms
---
#### Standard Platforms
For the latest version of tra-analysis, run `pip install tra-analysis` or `pip install tra_analysis`. The requirements for tra-analysis should be automatically installed.
### Exotic Platforms (Android)
#### Exotic Platforms (Android)
[Termux](https://termux.com/) is recommended for a linux environemnt on Android. Consult the [documentation](https://titanscouting.github.io/analysis/general/installation#exotic-platforms-android) for advice on installing the prerequisites. After installing the prerequisites, the package should be installed normally with `pip install tra-analysis` or `pip install tra_analysis`.
## Use
---
tra-analysis operates like any other python package. Consult the [documentation](https://titanscouting.github.io/analysis/tra_analysis/) for more information.
# Supported Platforms
## Supported Platforms
---
Although any modern 64 bit platform should be supported, the following platforms have been tested to be working:
* AMD64 (Tested on Zen, Zen+, and Zen 2)
* Intel 64/x86_64/x64 (Tested on Kaby Lake, Ice Lake)
* ARM64 (Tested on Broadcom BCM2836 SoC, Broadcom BCM2711 SoC)
###
The following OSes have been tested to be working:
* Linux Kernel 3.16, 4.4, 4.15, 4.19, 5.4
* Ubuntu 16.04, 18.04, 20.04
* Debian (and Debian derivaives) Jessie, Buster
* Windows 7, 10
###
The following python versions are supported:
* python 3.6 (not tested)
* python 3.7
* python 3.8
---
# `data-analysis`
To facilitate data analysis of collected scouting data in a user firendly tool, we created the data-analysis application. At its core it uses the tra-analysis package to conduct any number of user selected tests on data collected from the TRA scouting app. It uploads these tests back to MongoDB where it can be viewed from the app at any time.
The data-analysis application also uses the TRA API to interface with MongoDB and uses the TBA API to collect additional data (match win/loss).
The application can be configured with a configuration tool or by editing the config.json directly.
## Prerequisites
---
Before installing and using data-analysis, make sure that you have installed the folowing prerequisites:
- A common operating system like **Windows** or (*most*) distributions of **Linux**. BSD may work but has not been tested nor is it reccomended.
- [Python](https://www.python.org/) version **3.6** or higher
- [Pip](https://pip.pypa.io/en/stable/) (installation instructions [here](https://pip.pypa.io/en/stable/installing/))
## Installing Requirements
---
Once navigated to the data-analysis folder run `pip install -r requirements.txt` to install all of the required python libraries.
## Scripts
---
The data-analysis application is a collection of various scripts and one config file. For users, only the main application `superscript.py` and the config file `config.json` are important.
To run the data-analysis application, navigate to the data-analysis folder once all requirements have been installed and run `python superscript.py`. If you encounter the error:
`pymongo.errors.ConfigurationError: Empty host (or extra comma in host list).`
don't worry, you may have just not configured the application correctly, but would otherwise work. Refer to [the documentation](https://titanscouting.github.io/analysis/data_analysis/Config) to learn how to configure data-analysis.
# Contributing
Read our included contributing guidelines (`CONTRIBUTING.md`) for more information and feel free to reach out to any current maintainer for more information.
# Build Statuses
![Analysis Unit Tests](https://github.com/titanscout2022/red-alliance-analysis/workflows/Analysis%20Unit%20Tests/badge.svg)
![Superscript Unit Tests](https://github.com/titanscout2022/red-alliance-analysis/workflows/Superscript%20Unit%20Tests/badge.svg?branch=master)
![Superscript Unit Tests](https://github.com/titanscout2022/red-alliance-analysis/workflows/Superscript%20Unit%20Tests/badge.svg?branch=master)

6
SECURITY.md Normal file
View File

@ -0,0 +1,6 @@
# Security Policy
## Reporting a Vulnerability
Please email `titanscout2022@gmail.com` to report a vulnerability.

View File

@ -1,8 +1,11 @@
from tra_analysis import analysis as an
from tra_analysis import metrics
from tra_analysis import fits
def test_():
test_data_linear = [1, 3, 6, 7, 9]
x_data_circular = []
y_data_circular = []
y_data_ccu = [1, 3, 7, 14, 21]
y_data_ccd = [1, 5, 7, 8.5, 8.66]
test_data_scrambled = [-32, 34, 19, 72, -65, -11, -43, 6, 85, -17, -98, -26, 12, 20, 9, -92, -40, 98, -78, 17, -20, 49, 93, -27, -24, -66, 40, 84, 1, -64, -68, -25, -42, -46, -76, 43, -3, 30, -14, -34, -55, -13, 41, -30, 0, -61, 48, 23, 60, 87, 80, 77, 53, 73, 79, 24, -52, 82, 8, -44, 65, 47, -77, 94, 7, 37, -79, 36, -94, 91, 59, 10, 97, -38, -67, 83, 54, 31, -95, -63, 16, -45, 21, -12, 66, -48, -18, -96, -90, -21, -83, -74, 39, 64, 69, -97, 13, 55, 27, -39]
@ -28,4 +31,5 @@ def test_():
assert all(a == b for a, b in zip(an.Sort().shellsort(test_data_scrambled), test_data_sorted))
assert all(a == b for a, b in zip(an.Sort().bubblesort(test_data_scrambled), test_data_sorted))
assert all(a == b for a, b in zip(an.Sort().cyclesort(test_data_scrambled), test_data_sorted))
assert all(a == b for a, b in zip(an.Sort().cocktailsort(test_data_scrambled), test_data_sorted))
assert all(a == b for a, b in zip(an.Sort().cocktailsort(test_data_scrambled), test_data_sorted))
assert fits.CircleFit(x=[0,0,-1,1], y=[1, -1, 0, 0]).LSC() == (0.0, 0.0, 1.0, 0.0)

View File

@ -0,0 +1,85 @@
# Titan Robotics Team 2022: CPU fitting models
# Written by Dev Singh
# Notes:
# this module is cuda-optimized (as appropriate) and vectorized (except for one small part)
# setup:
__version__ = "0.0.1"
# changelog should be viewed using print(analysis.fits.__changelog__)
__changelog__ = """changelog:
0.0.1:
- initial release, add circle fitting with LSC
"""
__author__ = (
"Dev Singh <dev@devksingh.com>"
)
__all__ = [
'CircleFit'
]
import numpy as np
class CircleFit:
"""Class to fit data to a circle using the Least Square Circle (LSC) method"""
# For more information on the LSC method, see:
# http://www.dtcenter.org/sites/default/files/community-code/met/docs/write-ups/circle_fit.pdf
def __init__(self, x, y, xy=None):
self.ournp = np #todo: implement cupy correctly
if type(x) == list:
x = np.array(x)
if type(y) == list:
y = np.array(y)
if type(xy) == list:
xy = np.array(xy)
if xy != None:
self.coords = xy
else:
# following block combines x and y into one array if not already done
self.coords = self.ournp.vstack(([x.T], [y.T])).T
def calc_R(x, y, xc, yc):
"""Returns distance between center and point"""
return self.ournp.sqrt((x-xc)**2 + (y-yc)**2)
def f(c, x, y):
"""Returns distance between point and circle at c"""
Ri = calc_R(x, y, *c)
return Ri - Ri.mean()
def LSC(self):
"""Fits given data to a circle and returns the center, radius, and variance"""
x = self.coords[:, 0]
y = self.coords[:, 1]
# guessing at a center
x_m = self.ournp.mean(x)
y_m = self.ournp.mean(y)
# calculation of the reduced coordinates
u = x - x_m
v = y - y_m
# linear system defining the center (uc, vc) in reduced coordinates:
# Suu * uc + Suv * vc = (Suuu + Suvv)/2
# Suv * uc + Svv * vc = (Suuv + Svvv)/2
Suv = self.ournp.sum(u*v)
Suu = self.ournp.sum(u**2)
Svv = self.ournp.sum(v**2)
Suuv = self.ournp.sum(u**2 * v)
Suvv = self.ournp.sum(u * v**2)
Suuu = self.ournp.sum(u**3)
Svvv = self.ournp.sum(v**3)
# Solving the linear system
A = self.ournp.array([ [ Suu, Suv ], [Suv, Svv]])
B = self.ournp.array([ Suuu + Suvv, Svvv + Suuv ])/2.0
uc, vc = self.ournp.linalg.solve(A, B)
xc_1 = x_m + uc
yc_1 = y_m + vc
# Calculate the distances from center (xc_1, yc_1)
Ri_1 = self.ournp.sqrt((x-xc_1)**2 + (y-yc_1)**2)
R_1 = self.ournp.mean(Ri_1)
# calculate residual error
residu_1 = self.ournp.sum((Ri_1-R_1)**2)
return (xc_1, yc_1, R_1, residu_1)

View File

@ -1,8 +1,9 @@
# Titan Robotics Team 2022: CUDA-based Regressions Module
# Not actively maintained, may be removed in future release
# Written by Arthur Lu & Jacob Levine
# Notes:
# this module has been automatically inegrated into analysis.py, and should be callable as a class from the package
# this module is cuda-optimized and vectorized (except for one small part)
# this module is cuda-optimized (as appropriate) and vectorized (except for one small part)
# setup:
__version__ = "0.0.4"
@ -25,7 +26,7 @@ __changelog__ = """
__author__ = (
"Jacob Levine <jlevine@imsa.edu>",
"Arthur Lu <learthurgo@gmail.com>"
"Arthur Lu <learthurgo@gmail.com>",
)
__all__ = [
@ -40,14 +41,15 @@ __all__ = [
'ExpRegKernel',
'SigmoidalRegKernelArthur',
'SGDTrain',
'CustomTrain'
'CustomTrain',
'CircleFit'
]
import torch
global device
device = "cuda:0" if torch.torch.cuda.is_available() else "cpu"
device = "cuda:0" if torch.cuda.is_available() else "cpu"
#todo: document completely
@ -217,4 +219,4 @@ def CustomTrain(self, kernel, optim, data, ground, loss=torch.nn.MSELoss(), iter
ls=loss(pred,ground_cuda)
ls.backward()
optim.step()
return kernel
return kernel

View File

@ -1,6 +1,7 @@
{
"max-threads": 0.5,
"team": "",
"competition": "2020ilch",
"competition": "",
"key":{
"database":"",
"tba":""

View File

@ -1,4 +1,4 @@
requests
pymongo
pandas
dnspython
tra-analysis

View File

@ -3,10 +3,18 @@
# Notes:
# setup:
__version__ = "0.7.0"
__version__ = "0.8.2"
# changelog should be viewed using print(analysis.__changelog__)
__changelog__ = """changelog:
0.8.2:
- readded while true to main function
- added more thread config options
0.8.1:
- optimized matchloop further by bypassing GIL
0.8.0:
- added multithreading to matchloop
- tweaked user log
0.7.0:
- finished implementing main function
0.6.2:
@ -114,16 +122,25 @@ __all__ = [
from tra_analysis import analysis as an
import data as d
from collections import defaultdict
import json
import math
import numpy as np
import os
from os import system, name
from pathlib import Path
from multiprocessing import Pool
import matplotlib.pyplot as plt
from concurrent.futures import ThreadPoolExecutor
import time
import warnings
global exec_threads
def main():
global exec_threads
warnings.filterwarnings("ignore")
while (True):
@ -138,6 +155,23 @@ def main():
metrics_tests = config["statistics"]["metric"]
print("[OK] configs loaded")
print("[OK] starting threads")
cfg_max_threads = config["max-threads"]
sys_max_threads = os.cpu_count()
if cfg_max_threads > -sys_max_threads and cfg_max_threads < 0 :
alloc_processes = sys_max_threads + cfg_max_threads
elif cfg_max_threads > 0 and cfg_max_threads < 1:
alloc_processes = math.floor(cfg_max_threads * sys_max_threads)
elif cfg_max_threads > 1 and cfg_max_threads <= sys_max_threads:
alloc_processes = cfg_max_threads
elif cfg_max_threads == 0:
alloc_processes = sys_max_threads
else:
print("[Err] Invalid number of processes, must be between -" + str(sys_max_threads) + " and " + str(sys_max_threads))
exit()
exec_threads = Pool(processes = alloc_processes)
print("[OK] " + str(alloc_processes) + " threads started")
apikey = config["key"]["database"]
tbakey = config["key"]["tba"]
print("[OK] loaded keys")
@ -151,15 +185,15 @@ def main():
pit_data = load_pit(apikey, competition)
print("[OK] loaded data in " + str(time.time() - start) + " seconds")
print("[OK] running tests")
print("[OK] running match stats")
start = time.time()
matchloop(apikey, competition, match_data, match_tests)
print("[OK] finished tests in " + str(time.time() - start) + " seconds")
print("[OK] finished match stats in " + str(time.time() - start) + " seconds")
print("[OK] running metrics")
print("[OK] running team metrics")
start = time.time()
metricloop(tbakey, apikey, competition, previous_time, metrics_tests)
print("[OK] finished metrics in " + str(time.time() - start) + " seconds")
print("[OK] finished team metrics in " + str(time.time() - start) + " seconds")
print("[OK] running pit analysis")
start = time.time()
@ -217,48 +251,78 @@ def load_match(apikey, competition):
return d.get_match_data_formatted(apikey, competition)
def simplestats(data_test):
data = np.array(data_test[0])
data = data[np.isfinite(data)]
ranges = list(range(len(data)))
test = data_test[1]
if test == "basic_stats":
return an.basic_stats(data)
if test == "historical_analysis":
return an.histo_analysis([ranges, data])
if test == "regression_linear":
return an.regression(ranges, data, ['lin'])
if test == "regression_logarithmic":
return an.regression(ranges, data, ['log'])
if test == "regression_exponential":
return an.regression(ranges, data, ['exp'])
if test == "regression_polynomial":
return an.regression(ranges, data, ['ply'])
if test == "regression_sigmoidal":
return an.regression(ranges, data, ['sig'])
def matchloop(apikey, competition, data, tests): # expects 3D array with [Team][Variable][Match]
def simplestats(data, test):
global exec_threads
data = np.array(data)
data = data[np.isfinite(data)]
ranges = list(range(len(data)))
if test == "basic_stats":
return an.basic_stats(data)
if test == "historical_analysis":
return an.histo_analysis([ranges, data])
if test == "regression_linear":
return an.regression(ranges, data, ['lin'])
if test == "regression_logarithmic":
return an.regression(ranges, data, ['log'])
if test == "regression_exponential":
return an.regression(ranges, data, ['exp'])
if test == "regression_polynomial":
return an.regression(ranges, data, ['ply'])
if test == "regression_sigmoidal":
return an.regression(ranges, data, ['sig'])
class AutoVivification(dict):
def __getitem__(self, item):
try:
return dict.__getitem__(self, item)
except KeyError:
value = self[item] = type(self)()
return value
return_vector = {}
team_filtered = []
variable_filtered = []
variable_data = []
test_filtered = []
result_filtered = []
return_vector = AutoVivification()
for team in data:
variable_vector = {}
for variable in data[team]:
test_vector = {}
variable_data = data[team][variable]
if variable in tests:
for test in tests[variable]:
test_vector[test] = simplestats(variable_data, test)
else:
pass
variable_vector[variable] = test_vector
return_vector[team] = variable_vector
team_filtered.append(team)
variable_filtered.append(variable)
variable_data.append((data[team][variable], test))
test_filtered.append(test)
result_filtered = exec_threads.map(simplestats, variable_data)
i = 0
result_filtered = list(result_filtered)
for result in result_filtered:
return_vector[team_filtered[i]][variable_filtered[i]][test_filtered[i]] = result
i += 1
push_match(apikey, competition, return_vector)