-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathsearch.json
121 lines (121 loc) · 37.7 KB
/
search.json
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
[
{
"objectID": "machine_learning.html",
"href": "machine_learning.html",
"title": "Machine Learning",
"section": "",
"text": "Most of the problems that are solved today with Machine Learning can be divided into two categories: supervised and unsupervised learning.\nThe main difference between these two categories is that in supervised learning, for each observation of the predictor variables \\((x_i)\\), there is a measure of the response variable \\((y_i)\\). That is, we know what the response is for previous examples and we want to predict future observations based on previous learned data. Supervised learning can be applied to regression or classification problems. The difference is in the type of variable being predicted. In regression, it is a continuous numerical variable, and in classification, it is a categorical variable. It can be used to identify risk factors in diseases such as cancer or to predict heart problems based on diet, clinical measures, and demographics.\nUnsupervised learning or clustering divides or segments the input data space into similar groups. Its goal is to find groups with similar characteristics, but we do not have prior knowledge. For example, it is used to segment clients into groups with similar patterns, detect anomalous behavior by identifying patterns that fall outside the usual clusters, or simplify or summarize very large datasets by grouping similar users."
},
{
"objectID": "about.html",
"href": "about.html",
"title": "About",
"section": "",
"text": "Hi there! My name is Carmen Gómez Valenzuela and I am a Data Scientist. This website serves as both my portfolio and personal notepad where I share my insights and ideas about Data Analysis, Data Science, Artificial Intelligence, and related fields.\nIn addition, I also run a blog where I share interesting discoveries from around the world.\nPlease note that this site, much like life, is constantly “under construction”. If you are interested in Data or Artificial Intelligence, or would like to discuss potential collaboration opportunities, feel free to contact me on LinkedIn.\nBest regards!"
},
{
"objectID": "asignatura_aprendizaje.html",
"href": "asignatura_aprendizaje.html",
"title": "Aprendizaje automático",
"section": "",
"text": "Introducción\nLa mayoría de los problemas que se resuelven hoy en día con Machine Learning se pueden dividir entre problemas de aprendizaje supervisado y no supervisado.\nLa principal diferencia entre estas dos categorías, es que en el aprendizae supervisado, para cada observación de las variables predictoras \\((x_i)\\) existe una medida de la variable respuesta \\((y_i)\\). Es decir, conocemos para los ejemplos previos cuál es la respueta. Se desea predecir futuras observaciones en función de datos previos ya aprendidos. El aprendizaje supervisado se puede aplicar a problemas de regresión o clasificación. La diferencia es el tipo de la variable a predecir. En regresión es una variable numérica continua y en clasificación es una variable categórica. Se puede utilizar para identificar fatores de riesgo en enfermedades como por ejempl, el cáncer, o bien para predecir problemas de corazón en función de la dieta, medidas clínicas y demografía.\nEl aprendizaje no supervisado o clustering divide o segmenta el espacio de los datos de entrada en grupos similares. Su objetivo es buscar grupos con características similares pero no disponemos de conocimiento previo. Por ejemplo, se utliza para segmentar clientes entre gupos con patrones similares, detectar comportamiento anómalo identificando patrones que caen fuera de los clusters habituales o simplificar o resumir datasets muyh grandes, agrupando usuarios similares.\n\n\nEvaluación de algoritmos de regresión\nEl objetivo de los algoritmos de regresión es predecir el valor de una variable numérica continua.\nUna forma de evaluar este algoritmo es con el error cuadrático medio (Root Mean Square Error-RMSE). Siendo a la observación y p la predicción \\[RMSE = \\sqrt{\\frac{\\displaystyle\\sum_{i=1}^n{(p_i-a_i)^2}}{n}}\\]\nBásicamente, son las diferencias al cuadrado entre el valor predicho y el valor real. Se espera que la dsitribución de estas diferencias tenga una distribución normal.\nOtra forma de evaluar el algoritmo es con el Mean Absolute Error (MAE): \\[MAE = \\frac{\\displaystyle\\sum_{i=1}^n{|p_i-a_i|}}{n}\\]\nEn este caso es la diferencia en valor absoluta entre la predicción y el valor real.\nTambién se puede usar el coeficiente de determinación \\(R^2\\) que nos dice como de similares son los valores predichos y los valores estimados.\n\n\nEvaluación de los algoritmos de clasificación\nLa clave del modelo predictivo es generalizar sobre los datos del futuro.\nUn modelo puede tener un error muy bajo en el conjunto de entrenamiento y comportarse mal con datos futuros. Por esta razón se suelen usar los datos de test.\nLa técnica de Hold-out divide los datos en train y test. La repetición de este método de hold-out se conoce con el nombre de cross-validation. El límite se conoce como leave-one-out cross validation: se entrena con todas las instancias (99%) menos una (1%). El problema que tiene esta técnica es que es computacionalmente muy costosa. Las pruebas empíricas reportan que no hay mucho beneficio por encima de utilizar más de 10 folds.\nOtra forma de evaluar la calidad de los algoritmos de clasificación es utilizar la matriz de confusión: es una tabla que organiza las predicciones en función de los valores reales de los datos.\n\nOtra forma de evaluar la clasificación es tener en cuenta lo que se conoce como precision y accuracy. En el caso de precision nos da información sobre los errores aleatorios, es una medida de variablidad estadística (variance). Accuracy es una medida del sesgo estadístico o erroers sistemáticos (bias)\n\nOtra métrica común es la curva ROC o área bajo la curva (AUC). Va desde 0.5 para un clasificador sin potencia predictiva hasta 1. Muestra el ratio entre los verdaderos positivos (true positive) y los falsos positivos."
},
{
"objectID": "blog.html",
"href": "blog.html",
"title": "Blog",
"section": "",
"text": "The Hawthorne Effect: How Being Watched Can Affect Your Behavior and Statistical Studies\n\n\n\n\n\n\n\nstatistics\n\n\n\n\n\n\n\n\n\n\n\nMay 10, 2023\n\n\nCarmen Gómez Valenzuela\n\n\n\n\n\n\n \n\n\n\n\nWelcome To My Blog\n\n\n\n\n\n\n\nnews\n\n\n\n\n\n\n\n\n\n\n\nMay 7, 2023\n\n\nTristan O’Malley\n\n\n\n\n\n\nNo matching items"
},
{
"objectID": "multivariate_analysis/1-Introduction.html",
"href": "multivariate_analysis/1-Introduction.html",
"title": "1. Introduction.",
"section": "",
"text": "Statistics cannot be done without data, and in multivariate statistics, data is primarily presented in matrices, where we usually place the objects studied in the rows and the variables in the columns.\nOnce we have the data, we are interested in providing a good description of it, which can be done numerically or graphically.\nMatrix algebra and matrix calculus are fundamental to multivariate analysis.\nDescribing multivariate data involves specifying the data matrix, providing measures of central tendency and measures of dispersion, specifically variance-covariance matrices or global measures of variability.\nAnother topic that plays a significant role in multivariate analysis is the concept of distance: distances between objects, between variables, etc. Distances are the core foundation for applying multivariate methods such as cluster analysis or correspondence analysis.\nIn terms of graphical representation, aside from classic graphical representations like histograms or scatterplots, we will explore how to create multiple box plots and address the delicate issue of outliers, how to identify them and mitigate their negative effects.\nFinally, regarding matrix algebra and matrix calculus, it is necessary to understand how matrices are added, multiplied, what the diagonal is, the trace of a matrix or determinant, and most importantly, their properties. In particular, diagonalization of a matrix and singular value decomposition are two concepts that are absolutely necessary. Essentially, this involves a factorization of the matrix.\n\nReferences\n\n\nEveritt, B. S. 2005. An r and s-PLUS Companion to Multivariate Analysis. Springer Texts in Statistics. London: Springer.\n\n\nManly, Bryan FJ, and Jorge A Navarro Alberto. 2016. Multivariate Statistical Methods: A Primer. 4th ed. New York: Chapman; Hall/CRC. https://doi.org/10.1201/9781315382135."
},
{
"objectID": "multivariate_analysis/2-Principal_components_analysis.html",
"href": "multivariate_analysis/2-Principal_components_analysis.html",
"title": "1. Introduction.",
"section": "",
"text": "El análisis de componentes principales es sin duda la técnica más utilizada en análisis multivariante y en campos relacionados. Se ha aplicado a la ecología, la economía, la ciencia de materiales, o al análisis de datos ómicos.\nUna de las principales aplicaciones del análisis de componentes principales es para buscar lo que llamamos variables o factores latentes en los datos. Es decir, variables que no son evidentes, que no podemos observar, pero que están allí y explican distintas características de los datos. Por ejemplo, estudio en el que se recogieron 49 gorriones enfermos moribundos"
},
{
"objectID": "multivariate_analysis/pca.html",
"href": "multivariate_analysis/pca.html",
"title": "Principal Component Analysis",
"section": "",
"text": "PCA Algorithm for Feature Extraction\nThe following represents 6 steps of principal component analysis (PCA) algorithm:\nStandardize the dataset: Standardizing / normalizing the dataset is the first step one would need to take before performing PCA. The PCA calculates a new projection of the given data set representing one or more features. The new axes are based on the standard deviation of the value of these features. So, a feature / variable with a high standard deviation will have a higher weight for the calculation of axis than a variable / feature with a low standard deviation. If the data is normalized / standardized, the standard deviation of all fetaures / variables get measured on the same scale. Thus, all variables have the same weight and PCA calculates relevant axis appropriately. Note that the data is standardized / normalized after creating training / test split. Python’s sklearn.preprocessing StandardScaler class can be used for standardizing the dataset. Construct the covariance matrix: Once the data is standardized, the next step is to create n X n-dimensional covariance matrix, where n is the number of dimensions in the dataset. The covariance matrix stores the pairwise covariances between the different features. Note that a positive covariance between two features indicates that the features increase or decrease together, whereas a negative covariance indicates that the features vary in opposite directions. Python‘s Numpy cov method can be used to create covariance matrix. Perform Eigendecomposition of covariance matrix: The next step is to decompose the covariance matrix into its eigenvectors and eigenvalues. The eigenvectors of the covariance matrix represent the principal components (the directions of maximum variance), whereas the corresponding eigenvalues will define their magnitude. Numpy linalg.eig or linalg.eigh can be used for decomposing covariance matrix into eigenvectors and eigenvalues. Selection of most important Eigenvectors / Eigenvalues: Sort the eigenvalues by decreasing order to rank the corresponding eigenvectors. Select k eigenvectors, which correspond to the k largest eigenvalues, where k is the dimensionality of the new feature subspace (). One can used the concepts of explained variance to select the k most important eigenvectors. Projection matrix creation of important eigenvectors: Construct a projection matrix, W, from the top k eigenvectors. Training / test dataset transformation: Finally, transform the d-dimensional input training and test dataset using the projection matrix to obtain the new k-dimensional feature subspace."
},
{
"objectID": "multivariate_analysis.html",
"href": "multivariate_analysis.html",
"title": "Multivariate Analyisis",
"section": "",
"text": "Welcome to my notes on multivariate statistical analysis! As a data scientist, I have found that understanding and applying multivariate statistical techniques is essential for extracting insights from complex datasets. These notes are my personal collection of key concepts, techniques, and practical applications that I have learned and used over the years.\nMy goal in creating these notes is to provide a useful resource for myself to review and reference in the future, as well as to share my knowledge with others who are interested in learning about multivariate statistical analysis. Whether you are just starting out in the field or have years of experience, I hope you will find these notes to be a valuable reference and a source of inspiration for your own work.\nThroughout these notes, I will cover a range of topics related to multivariate statistical analysis, including exploratory data analysis, principal component analysis, factor analysis, clustering, discriminant analysis, and more. I will provide clear explanations of key concepts, step-by-step instructions for implementing techniques in Python, and real-world examples to demonstrate how these techniques can be used to extract insights from complex datasets.\nI invite you to join me on this journey through the world of multivariate statistical analysis. Let’s explore together how these powerful techniques can be used to unlock the hidden patterns and relationships in data, and turn raw data into actionable insights.\nLastly, I would like to add that if you find any errors, please do not hesitate to contact me through LinkedIn. I would be very grateful for your feedback."
},
{
"objectID": "notepad/ai-planning.html",
"href": "notepad/ai-planning.html",
"title": "Artificial Intelligence Planning",
"section": "",
"text": "“Planning is the art and practice of thinking before acting: of reviewing the courses of action one has available and predicting their expected (and unexpected) results to be able to choose the course of action most beneficial with respect to one’s goals.” –Patrik Haslum–\n\nIntroduction\nBasic Planning Problem: given descriptions of possible initial states of the world, desired goals and a set of possible actions, synthesize a plan that is guaranteed to generate a state which contains the desired goals.\nIn artificial intelligence, planning refers to the process of generating a sequence of actions to achieve a specific goal or set of goals. Planning involves finding an optimal sequence of actions that can transform an initial state of the world into a desired goal state while satisfying any relevant constraints and optimizing some objective function.\nIn summary, AI Planning is the model-based approach to action selection: prouduces the behaviour from the model (solves the model)\n\n\nClassical Planning Model\nClassical planning model is a tuple of six elements: \\[\\mathcal{S} = \\langle S, s_0, S_G, A, f, c \\rangle \\] where:\n\n\nthe first is the finite set of states \\(S\\)\n\n\none of this states is called the initial state \\(s_0∈S\\)\n\n\nsome known non empty subset of these are called goal states: \\(S_G⊆S\\)\n\n\na final set of actions \\(A(s) ⊆ A\\) that if a subset of applicable actions defined for each state \\(s∈S\\)\n\n\nA deterministic transition function maps a state s and an applicable action a into a resulting state s’ \\[s' = f(a, s)~for~a∈A(s)\\]\n\n\nfinally each transition has associated a non-negative action cost \\(c(a, s)\\)\n\n\nSo solutions are sequences of applicable actions that start at the initial state \\(s_0\\) and add some goal states \\(S_G\\)\n\n\nLanguage for Classical Planning: Strips\nIn order to be able to completely represent those models we use the planning languages. The simplest one and the most common is Strips, which stands for Stanford Research Institute of Problem Solver.\nIn Strips the Planning task is given by a 5-tuple of elements: \\[\\Pi = \\langle F, O, c, I, G \\rangle\\]\n\n\nF: finite set of atoms or boolean variables\n\n\nO: finite set of operators or actions of form \\(\\langle Add, Del, Pre \\rangle\\) (Add/Delete/Preconditions), each of these are a subset of atoms\n\n\n\\(c: O \\mapsto \\mathbb{R}\\): a non negative operator cost or action cost that gives a non negative value for each action\n\n\nI: initial state (a subset of atoms)\n\n\nG: goal description (also a subset of atoms)\n\n\nA plan is a sequence of applicable actions that maps I into a state consistent with G\n\n\nFrom Language to Models\nGiven a Strips planning task we can define the model as follows: a Strips Planning task \\(\\Pi = \\langle F, O, c, I, G \\rangle\\) determines state model \\(\\mathcal{S}(\\Pi)\\) where:\n\n\nthe states \\(s∈S\\) are collections of atoms from F (all possible subsets of F)\n\n\nthe initial state \\(s_0\\) is I\n\n\nthe goal states \\(s\\) are such that \\(G⊆s\\) (all states that are super sets of G)\n\n\nthe actions a in \\(A(s)\\) are ops in O s.t. \\(Pre(a)⊆s\\) (the actions in A(s) are operators in O such as the precondition is included in the state)\n\n\nand the next state is \\(s'=s− Del(a) + Add(a)\\) is generated by removing the set of delete effects and adding the set of additions\n\n\naction costs \\(c(a, s)=c(a)\\): all transitions that belongs to the same action will get the same cost\n\n\nSolutions to the plannig model are exactly the solutions for the plannig task: Solutions of \\(S(\\Pi)\\) are plans of \\(\\Pi\\)\n\n\nPlanning and Model-based Reinforcement Learning\nLet’s see how Strips planning tasks can be used to define the commonly used models in the realm. For the forward model we need to define the next state, that is done given the current state and an action: you take a state, remove the delete effects and give up the add effects. The only difference is that is done only for applicable actions, for non applicable actions is just a cell flow. \\[(a, s_i) \\rightarrow s_{(i+1)} = s_i−Del(a)+Add(a)~if~Pre(a)⊆s_i, (a,s_i) \\mapsto s_i~ otherwise\\]\nFor the backwards/reverse model, we regress an action in a state by removing the add effects and adding preconditions. This is only valid for actions whose delete effects don’t appear in the state. \\[s_{(i+1)} \\rightarrow (a, s_i)~where~Del(a)∩s_{(i+1)}=∅~and~s_i = s_{(i+1)} − Add(a) + Pre(a)\\]\nFor inverse model we can find an action whose preconditions are included in the \\(s_i\\) and a plan which would result in \\(s_{(i+1)\\). That’s can be done just by going over all applicable actions. \\[(s_{(i+1)}, s_i) \\rightarrow a,~where~ Pre(a)⊆s_i~and~s_{(i+1)} = s_i − Del(a) + Add(a)\\]\nThe rewards approximate the negative of the true optimal cost \\(-c^*(s_{(i+1)})\\) of reaching the goal (reward obtanaible) from the state \\(s_{(i+1)}:(s_i, a, s_{(i+1)}) \\rightarrow h(s_{(i+1)}) − h(s_i)\\)\n\n\nComputational problems of Classical Planning\n\n\nCost of ptimal planning: the cost of find a plan (sequence of actions) that minimizes the summed action cost\n\n\nIn satisficing planning: we want to find any plan improving the plan cost as much as possible (the cheaper plan)\n\n\nIn agile planning: we care only about how quickly a plan can found ignoring completely its quantity\n\n\nFor top-k planning: find k plans such that no cheaper plans exist (basically genrealizes the cost optimal planning)\n\n\nIn top-quality planning: we search for all possible plans up to a certain cost\n\n\nDiverse planning: deals with both plan quality and plan diversity (variety of problems, aiming at obtaining diverse set of plans considering plan quality as well)\n\n\nReferences\nAAAI 2022 Tutorial on AI Planning: Theory and Practice https://aiplanning-tutorial.github.io/ https://www.youtube.com/watch?v=q-mShBwHkc4 https://www.youtube.com/watch?v=PKISCipS9Og https://planning.wiki/ https://www.youtube.com/watch?v=XW0z8Oik6G8&list=PL1Q0jeuU6XppflOPFx1qQVuWbXTcjxevU https://www.youtube.com/watch?v=_NOVa4i7Us8&list=PL1Q0jeuU6XppflOPFx1qQVuWbXTcjxevU&index=7"
},
{
"objectID": "portfolio/air-quality.html",
"href": "portfolio/air-quality.html",
"title": "Air Quality",
"section": "",
"text": "import numpy as np\nimport matplotlib.pyplot as plt\n\nr = np.arange(0, 2, 0.01)\ntheta = 2 * np.pi * r\nfig, ax = plt.subplots(\n subplot_kw = {'projection': 'polar'} \n)\nax.plot(theta, r)\nax.set_rticks([0.5, 1, 1.5, 2])\nax.grid(True)\nplt.show()\n\n\n\n\nFigure 1: A line plot on a polar axis\n\n\n\n\n\nx = 10\nx + 5"
},
{
"objectID": "posts/the-hawthorne-effect/index.html",
"href": "posts/the-hawthorne-effect/index.html",
"title": "The Hawthorne Effect: How Being Watched Can Affect Your Behavior and Statistical Studies",
"section": "",
"text": "Have you ever noticed that your behavior changes when you know you’re being watched? This phenomenon is known as the Hawthorne Effect, named after a series of studies conducted in the 1920s at the Hawthorne Works factory in Chicago.\nThe studies were originally conducted to investigate the relationship between lighting levels in the workplace and worker productivity. Researchers found that productivity increased when lighting levels were improved, but also when they were decreased. After further investigation, they found that workers’ behavior changed simply because they knew they were being observed.\nThe Hawthorne Effect has since been replicated in a variety of settings, including schools, hospitals, and even in studies of consumer behavior. It’s important to note that the effect can be positive or negative, depending on the context. For example, in a classroom setting, the Hawthorne Effect might cause students to perform better when they know they’re being watched, but it could also cause them to feel anxious and perform worse.\nWhen conducting a statistical study, researchers aim to gather data that accurately reflects the true state of the population being studied. However, the Hawthorne Effect can introduce a bias into the data, as participants may alter their behavior or responses in response to the study itself. This can lead to a “study effect” that is not related to the treatment being studied, but rather to the participants’ awareness of the study itself.\nFor example, imagine a study on the effectiveness of a new teaching method in improving student performance. If the study participants (i.e. the students) are aware that they are being studied, they may behave differently than they would in a normal classroom setting. They may pay closer attention to the lessons, study more, or even try to please the researchers by performing better than they normally would. This can make it difficult to draw accurate conclusions about the effectiveness of the method , and can limit the generalizability of the results to the wider population.\nTo minimize the impact of the Hawthorne Effect on statistical studies, researchers may use a variety of strategies. These can include blinding participants to the purpose of the study, using control groups to compare the effects of the treatment to a baseline, or using naturalistic observation to observe behavior in a more authentic setting.\nIt’s important to note that the Hawthorne Effect is not always negative, and can sometimes be used to improve study outcomes. For example, researchers may intentionally manipulate the study environment to increase participant motivation or engagement, which can improve the reliability and validity of the study results.\nIn conclusion, the Hawthorne Effect demonstrates the powerful impact that observation can have on human behavior. By being aware of this phenomenon, we can use it to improve our own performance and create more positive environments in our workplaces and other settings. However, in statistical studies, the Hawthorne Effect can introduce bias and make it difficult to draw accurate conclusions. Researchers must be aware of this phenomenon when designing and interpreting their studies, and use appropriate research designs to minimize its impact."
},
{
"objectID": "posts/the-hawthorne-effect/index.html#references",
"href": "posts/the-hawthorne-effect/index.html#references",
"title": "The Hawthorne Effect: How Being Watched Can Affect Your Behavior and Statistical Studies",
"section": "References",
"text": "References\nFriedman, L. M., & Furberg, C. D. (1998). DeBiasing through an awareness of the Hawthorne effect. Academic Medicine, 73(12), 1316-1318.\nMcCarney, R., Warner, J., Iliffe, S., van Haselen, R., Griffin, M., & Fisher, P. (2007). The Hawthorne effect: a randomised, controlled trial. BMC medical research methodology, 7(1), 30.\nStang, A. (2010). Critical evaluation of the Hawthorne effect from a methodological perspective. Scandinavian Journal of Work, Environment & Health, 36(2), 163-171."
},
{
"objectID": "posts/welcome/index.html",
"href": "posts/welcome/index.html",
"title": "Welcome To My Blog",
"section": "",
"text": "This is the first post in a Quarto blog. Welcome!\n\nSince this post doesn’t specify an explicit image, the first image in the post will be used in the listing page of posts."
},
{
"objectID": "index.html",
"href": "index.html",
"title": "CGoDataScience",
"section": "",
"text": "Hi there! My name is Carmen Gómez Valenzuela and I am a Data Scientist. This website serves as both my portfolio and personal notepad where I share my insights and ideas about Data Analysis, Data Science, Artificial Intelligence, and related fields.\nIn addition, I also run a blog where I share interesting discoveries from around the world.\nPlease note that this site, much like life, is constantly “under construction”. If you are interested in Data or Artificial Intelligence, or would like to discuss potential collaboration opportunities, feel free to contact me on LinkedIn.\nBest regards!"
},
{
"objectID": "multivariate_analysis/2- Principal_Components_Analysis.html",
"href": "multivariate_analysis/2- Principal_Components_Analysis.html",
"title": "2. Principal Component Analysis",
"section": "",
"text": "PCA Algorithm for Feature Extraction\nThe following represents 6 steps of principal component analysis (PCA) algorithm:\nStandardize the dataset: Standardizing / normalizing the dataset is the first step one would need to take before performing PCA. The PCA calculates a new projection of the given data set representing one or more features. The new axes are based on the standard deviation of the value of these features. So, a feature / variable with a high standard deviation will have a higher weight for the calculation of axis than a variable / feature with a low standard deviation. If the data is normalized / standardized, the standard deviation of all fetaures / variables get measured on the same scale. Thus, all variables have the same weight and PCA calculates relevant axis appropriately. Note that the data is standardized / normalized after creating training / test split. Python’s sklearn.preprocessing StandardScaler class can be used for standardizing the dataset. Construct the covariance matrix: Once the data is standardized, the next step is to create n X n-dimensional covariance matrix, where n is the number of dimensions in the dataset. The covariance matrix stores the pairwise covariances between the different features. Note that a positive covariance between two features indicates that the features increase or decrease together, whereas a negative covariance indicates that the features vary in opposite directions. Python‘s Numpy cov method can be used to create covariance matrix. Perform Eigendecomposition of covariance matrix: The next step is to decompose the covariance matrix into its eigenvectors and eigenvalues. The eigenvectors of the covariance matrix represent the principal components (the directions of maximum variance), whereas the corresponding eigenvalues will define their magnitude. Numpy linalg.eig or linalg.eigh can be used for decomposing covariance matrix into eigenvectors and eigenvalues. Selection of most important Eigenvectors / Eigenvalues: Sort the eigenvalues by decreasing order to rank the corresponding eigenvectors. Select k eigenvectors, which correspond to the k largest eigenvalues, where k is the dimensionality of the new feature subspace (). One can used the concepts of explained variance to select the k most important eigenvectors. Projection matrix creation of important eigenvectors: Construct a projection matrix, W, from the top k eigenvectors. Training / test dataset transformation: Finally, transform the d-dimensional input training and test dataset using the projection matrix to obtain the new k-dimensional feature subspace."
},
{
"objectID": "multivariate_analysis/2-Principal_Components_Analysis.html",
"href": "multivariate_analysis/2-Principal_Components_Analysis.html",
"title": "2. Principal Component Analysis",
"section": "",
"text": "PCA Algorithm for Feature Extraction\nThe following represents 6 steps of principal component analysis (PCA) algorithm:\nStandardize the dataset: Standardizing / normalizing the dataset is the first step one would need to take before performing PCA. The PCA calculates a new projection of the given data set representing one or more features. The new axes are based on the standard deviation of the value of these features. So, a feature / variable with a high standard deviation will have a higher weight for the calculation of axis than a variable / feature with a low standard deviation. If the data is normalized / standardized, the standard deviation of all fetaures / variables get measured on the same scale. Thus, all variables have the same weight and PCA calculates relevant axis appropriately. Note that the data is standardized / normalized after creating training / test split. Python’s sklearn.preprocessing StandardScaler class can be used for standardizing the dataset. Construct the covariance matrix: Once the data is standardized, the next step is to create n X n-dimensional covariance matrix, where n is the number of dimensions in the dataset. The covariance matrix stores the pairwise covariances between the different features. Note that a positive covariance between two features indicates that the features increase or decrease together, whereas a negative covariance indicates that the features vary in opposite directions. Python‘s Numpy cov method can be used to create covariance matrix. Perform Eigendecomposition of covariance matrix: The next step is to decompose the covariance matrix into its eigenvectors and eigenvalues. The eigenvectors of the covariance matrix represent the principal components (the directions of maximum variance), whereas the corresponding eigenvalues will define their magnitude. Numpy linalg.eig or linalg.eigh can be used for decomposing covariance matrix into eigenvectors and eigenvalues. Selection of most important Eigenvectors / Eigenvalues: Sort the eigenvalues by decreasing order to rank the corresponding eigenvectors. Select k eigenvectors, which correspond to the k largest eigenvalues, where k is the dimensionality of the new feature subspace (). One can used the concepts of explained variance to select the k most important eigenvectors. Projection matrix creation of important eigenvectors: Construct a projection matrix, W, from the top k eigenvectors. Training / test dataset transformation: Finally, transform the d-dimensional input training and test dataset using the projection matrix to obtain the new k-dimensional feature subspace."
},
{
"objectID": "automated_planning.html",
"href": "automated_planning.html",
"title": "Automated planning",
"section": "",
"text": "“Planning is the art and practice of thinking before acting: of reviewing the courses of action one has available and predicting their expected (and unexpected) results to be able to choose the course of action most beneficial with respect to one’s goals.” –Patrik Haslum–\n\nIntroduction\nBasic Planning Problem: given descriptions of possible initial states of the world, desired goals and a set of possible actions, synthesize a plan that is guaranteed to generate a state which contains the desired goals.\nIn artificial intelligence, planning refers to the process of generating a sequence of actions to achieve a specific goal or set of goals. Planning involves finding an optimal sequence of actions that can transform an initial state of the world into a desired goal state while satisfying any relevant constraints and optimizing some objective function.\nIn summary, AI Planning is the model-based approach to action selection: prouduces the behaviour from the model (solves the model)\n\n\nClassical Planning Model\nClassical planning model is a tuple of six elements: \\[\\mathcal{S} = \\langle S, s_0, S_G, A, f, c \\rangle \\] where:\n\n\nthe first is the finite set of states \\(S\\)\n\n\none of this states is called the initial state \\(s_0∈S\\)\n\n\nsome known non empty subset of these are called goal states: \\(S_G⊆S\\)\n\n\na final set of actions \\(A(s) ⊆ A\\) that if a subset of applicable actions defined for each state \\(s∈S\\)\n\n\nA deterministic transition function maps a state s and an applicable action a into a resulting state s’ \\[s' = f(a, s)~for~a∈A(s)\\]\n\n\nfinally each transition has associated a non-negative action cost \\(c(a, s)\\)\n\n\nSo solutions are sequences of applicable actions that start at the initial state \\(s_0\\) and add some goal states \\(S_G\\)\n\n\nLanguage for Classical Planning: Strips\nIn order to be able to completely represent those models we use the planning languages. The simplest one and the most common is Strips, which stands for Stanford Research Institute of Problem Solver.\nIn Strips the Planning task is given by a 5-tuple of elements: \\[\\Pi = \\langle F, O, c, I, G \\rangle\\]\n\n\nF: finite set of atoms or boolean variables\n\n\nO: finite set of operators or actions of form \\(\\langle Add, Del, Pre \\rangle\\) (Add/Delete/Preconditions), each of these are a subset of atoms\n\n\n\\(c: O \\mapsto \\mathbb{R}\\): a non negative operator cost or action cost that gives a non negative value for each action\n\n\nI: initial state (a subset of atoms)\n\n\nG: goal description (also a subset of atoms)\n\n\nA plan is a sequence of applicable actions that maps I into a state consistent with G\n\n\nFrom Language to Models\nGiven a Strips planning task we can define the model as follows: a Strips Planning task \\(\\Pi = \\langle F, O, c, I, G \\rangle\\) determines state model \\(\\mathcal{S}(\\Pi)\\) where:\n\n\nthe states \\(s∈S\\) are collections of atoms from F (all possible subsets of F)\n\n\nthe initial state \\(s_0\\) is I\n\n\nthe goal states \\(s\\) are such that \\(G⊆s\\) (all states that are super sets of G)\n\n\nthe actions a in \\(A(s)\\) are ops in O s.t. \\(Pre(a)⊆s\\) (the actions in A(s) are operators in O such as the precondition is included in the state)\n\n\nand the next state is \\(s'=s− Del(a) + Add(a)\\) is generated by removing the set of delete effects and adding the set of additions\n\n\naction costs \\(c(a, s)=c(a)\\): all transitions that belongs to the same action will get the same cost\n\n\nSolutions to the plannig model are exactly the solutions for the plannig task: Solutions of \\(S(\\Pi)\\) are plans of \\(\\Pi\\)\n\n\nPlanning and Model-based Reinforcement Learning\nLet’s see how Strips planning tasks can be used to define the commonly used models in the realm. For the forward model we need to define the next state, that is done given the current state and an action: you take a state, remove the delete effects and give up the add effects. The only difference is that is done only for applicable actions, for non applicable actions is just a cell flow. \\[(a, s_i) \\rightarrow s_{(i+1)} = s_i−Del(a)+Add(a)~if~Pre(a)⊆s_i, (a,s_i) \\mapsto s_i~ otherwise\\]\nFor the backwards/reverse model, we regress an action in a state by removing the add effects and adding preconditions. This is only valid for actions whose delete effects don’t appear in the state. \\[s_{(i+1)} \\rightarrow (a, s_i)~where~Del(a)∩s_{(i+1)}=∅~and~s_i = s_{(i+1)} − Add(a) + Pre(a)\\]\nFor inverse model we can find an action whose preconditions are included in the \\(s_i\\) and a plan which would result in \\(s_{(i+1)\\). That’s can be done just by going over all applicable actions. \\[(s_{(i+1)}, s_i) \\rightarrow a,~where~ Pre(a)⊆s_i~and~s_{(i+1)} = s_i − Del(a) + Add(a)\\]\nThe rewards approximate the negative of the true optimal cost \\(-c^*(s_{(i+1)})\\) of reaching the goal (reward obtanaible) from the state \\(s_{(i+1)}:(s_i, a, s_{(i+1)}) \\rightarrow h(s_{(i+1)}) − h(s_i)\\)\n\n\nComputational problems of Classical Planning\n\n\nCost of ptimal planning: the cost of find a plan (sequence of actions) that minimizes the summed action cost\n\n\nIn satisficing planning: we want to find any plan improving the plan cost as much as possible (the cheaper plan)\n\n\nIn agile planning: we care only about how quickly a plan can found ignoring completely its quantity\n\n\nFor top-k planning: find k plans such that no cheaper plans exist (basically genrealizes the cost optimal planning)\n\n\nIn top-quality planning: we search for all possible plans up to a certain cost\n\n\nDiverse planning: deals with both plan quality and plan diversity (variety of problems, aiming at obtaining diverse set of plans considering plan quality as well)\n\n\nReferences\nAAAI 2022 Tutorial on AI Planning: Theory and Practice https://aiplanning-tutorial.github.io/ https://www.youtube.com/watch?v=q-mShBwHkc4 https://www.youtube.com/watch?v=PKISCipS9Og https://planning.wiki/ https://www.youtube.com/watch?v=XW0z8Oik6G8&list=PL1Q0jeuU6XppflOPFx1qQVuWbXTcjxevU https://www.youtube.com/watch?v=_NOVa4i7Us8&list=PL1Q0jeuU6XppflOPFx1qQVuWbXTcjxevU&index=7"
}
]