vrproject

Complex Network Analysis VR-project
git clone git://popovic.xyz/vrproject.git

ideas.ipynb (4444B)


{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 32,
   "id": "b89df471",
   "metadata": {},
   "outputs": [],
   "source": [
    "import pypi_xmlrpc\n",
    "import requests"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "f0f01af0",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "777549f7",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "id": "22d4fb60",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "requests:\n",
      "{\n",
      "    \"comment_text\": \"\",\n",
      "    \"digests\": {\n",
      "        \"md5\": \"c90a48af18eb4170dbe4832c1104440c\",\n",
      "        \"sha256\": \"210a82e678c45d433a4ad1f105974b3102a8ab5198872dc0a3238a8750d4c65e\"\n",
      "    },\n",
      "    \"downloads\": -1,\n",
      "    \"filename\": \"requests-0.10.0.tar.gz\",\n",
      "    \"has_sig\": false,\n",
      "    \"md5_digest\": \"c90a48af18eb4170dbe4832c1104440c\",\n",
      "    \"packagetype\": \"sdist\",\n",
      "    \"python_version\": \"source\",\n",
      "    \"requires_python\": null,\n",
      "    \"size\": 62046,\n",
      "    \"upload_time\": \"2012-01-22T05:08:17\",\n",
      "    \"upload_time_iso_8601\": \"2012-01-22T05:08:17.091441Z\",\n",
      "    \"url\": \"https://files.pythonhosted.org/packages/62/35/0230421b8c4efad6624518028163329ad0c2df9e58e6b3bee013427bf8f6/requests-0.10.0.tar.gz\",\n",
      "    \"yanked\": false,\n",
      "    \"yanked_reason\": null\n",
      "}\n"
     ]
    }
   ],
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "0fac5d60",
   "metadata": {},
   "source": [
    "***IDEA*** <br>\n",
    "For each package in all packages:\n",
    "\n",
    "    1 get the package's JSON metadata\n",
    "    2 find the **first** release version with a date and save that date\n",
    "    3 open a file in the format 'year-month-network.csv' representing the \n",
    "        year and month the package was created (grouping together packages \n",
    "        that were first released in the same month). If already open, do nothing\n",
    "    4 write a line to the file in the form repository|[dependencies...]\n",
    "END; 5 : close all files.\n",
    "\n",
    "\n",
    "***Problem***\n",
    "When building this kind of time-dependent network by adding nodes over time,\n",
    "a dependency may not exist yet at the time of the first release, e.g. because the\n",
    "development team added it only after the first version. I have no way of checking the \n",
    "dependencies of each version"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "378be17f",
   "metadata": {},
   "source": [
    "***Addressing the problem***\n",
    "\n",
    "To handle a requirement that has not been released yet,\n",
    "let's say we have a dictionary of pd.DataFrames, where each key corresponds \n",
    "to a year-month and the keys are sorted by date.\n",
    "\n",
    "FOR each entry in the dictionary do:\n",
    "    FOR each entry (package, requirement) in the dataframe do:\n",
    "    \n",
    "        1 check if the ***requirement*** is in any of the dataframes from before as package\n",
    "        2 if yes : pass\n",
    "        3 if not : delete the requirement and create a standalone package (node), additionally\n",
    "            save both package and requirement together (as a tuple) in a list, say the cache list\n",
    "        4 check if the package is in the cache list as a ***requirement***\n",
    "        5 if yes: append the pair package requirement found in the cache list to the \n",
    "            dataframe and delete entry in the cache list\n",
    "        6 if not: pass\n",
    "        7 lastly, deduplicate the dataframe (treat it as a ***set***) to avoid duplicate nodes, e.g.\n",
    "            after a requirement was deleted (use pd.DataFrame.drop_duplicates())\n",
    "            \n",
    "    DONE\n",
    "DONE"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
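
The two markdown cells in the notebook describe the plan only in pseudocode. A minimal sketch of both steps could look like the block below. It is an illustration under stated assumptions, not the project's implementation. The function names `first_release_month` and `resolve_edges` are hypothetical. Plain lists of `(package, requirement)` tuples stand in for the pd.DataFrames, with `None` marking a standalone node. The `releases` argument is assumed to map each version to a list of file records shaped like the one printed in the notebook output (each with an `upload_time` field). The sketch also assumes that packages first released in the same month already count as existing, which the notes leave open.

```python
from datetime import datetime


def first_release_month(releases):
    """Return 'YYYY-MM' of the earliest upload over all versions, or None.

    `releases` is assumed to map version -> list of file records, each with
    an ISO 'upload_time' string such as '2012-01-22T05:08:17'.
    """
    times = [
        datetime.fromisoformat(record["upload_time"])
        for files in releases.values()
        for record in files
    ]
    return min(times).strftime("%Y-%m") if times else None


def resolve_edges(monthly):
    """Defer edges whose requirement does not exist yet.

    `monthly` maps 'YYYY-MM' -> list of (package, requirement or None).
    Returns (result, cache): `result` maps each month to its deduplicated
    rows, `cache` holds edges whose requirement never appeared.
    """
    known = set()   # packages released up to and including the current month
    cache = []      # deferred (package, requirement) pairs
    result = {}
    for month in sorted(monthly):  # 'YYYY-MM' strings sort chronologically
        rows = monthly[month]
        # Assumption: same-month packages already count as existing.
        known.update(package for package, _ in rows)
        out = set()  # a set removes duplicate rows (step 7 in the notes)
        for package, requirement in rows:
            if requirement is None or requirement in known:
                out.add((package, requirement))
            else:
                out.add((package, None))              # standalone node
                cache.append((package, requirement))  # remember the edge
        # Re-attach cached edges whose requirement has appeared by now.
        waiting = []
        for package, requirement in cache:
            if requirement in known:
                out.add((package, requirement))
            else:
                waiting.append((package, requirement))
        cache = waiting
        result[month] = sorted(out, key=lambda row: (row[0], row[1] or ""))
    return result, cache
```

With this shape, an edge such as `("a", "b")` recorded in a month before `b` exists is held in the cache and only shows up in the month where `b` is first released. Edges to packages that never appear stay in the returned cache and can be inspected separately.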