Friday, December 18, 2015

Introduction to Analysis (Rosenlicht), chapter 3: metric spaces

Definition of metric spaces

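Definition. A metric space is a set \(E\), together with a rule which associates with each pair \(p, q \in E\) a real number \(d(p, q)\) such that:

  1. \(d(p, q) \ge 0\) for all \(p, q \in E\)
  2. \(d(p, q) = 0\) if and only if \(p = q\)
  3. \(d(p, q) = d(q, p)\) for all \(p, q \in E\)
  4. \(d(p, r) \le d(p, q) + d(q, r)\) for all \(p, q, r \in E\) (triangle inequality)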

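Proposition (Schwarz inequality): for any real numbers \(a_1, \dots, a_n, b_1, \dots, b_n\),
\[|a_1 b_1 + \dots + a_n b_n| \le \sqrt{a_1^2 + \dots + a_n^2}\,\sqrt{b_1^2 + \dots + b_n^2}\]
Cauchy–Schwarz inequality in vector form:
\[|a \cdot b| \le \|a\|\,\|b\|\]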

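Corollary of Schwarz inequality:
\[\sqrt{(a_1 + b_1)^2 + \dots + (a_n + b_n)^2} \le \sqrt{a_1^2 + \dots + a_n^2} + \sqrt{b_1^2 + \dots + b_n^2}\]

Or in vector form:
\[\|a + b\| \le \|a\| + \|b\|\]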

Proposition: the triangle inequality generalizes to a polygon inequality. For any points \(p_1, \dots, p_n\) of a metric space,
\[d(p_1, p_n) \le d(p_1, p_2) + d(p_2, p_3) + \dots + d(p_{n-1}, p_n)\]

Proposition: the difference of two sides of a triangle is at most the third side. For any points \(p_1, p_2, p_3\) of a metric space,
\[|d(p_1, p_3) - d(p_2, p_3)| \le d(p_1, p_2)\]

Open and closed sets

Definition of open and closed balls:

Definition of open set:

Proposition: basic properties of metric spaces:

Proposition: an open ball is also an open set.

Definition of closed sets:

Proposition: a closed ball is also a closed set.

Definition of boundedness:

Proposition: a nonempty closed subset of \(\mathbb{R}\) contains its extrema: it has a greatest element if it is bounded from above and a least element if it is bounded from below.

Convergence

Definition of convergence:

Uniqueness of convergence:

Definition of subsequence:

Subsequence of a convergent sequence is also convergent:

Convergent sequences are bounded:

Theorem. A subset \(S\) of a metric space \(E\) is closed iff every sequence of points of \(S\) that converges in \(E\) has its limit in \(S\).

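Proposition. If \(a_1, a_2, \dots\) and \(b_1, b_2, \dots\) are two convergent sequences of real numbers with limits \(a\) and \(b\) respectively, and \(a_n \le b_n\) always holds, then \(a \le b\).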

Proposition. A bounded monotonic sequence of real numbers is convergent.

Completeness

Definition of Cauchy sequence: elements are guaranteed to be arbitrarily close to each other (eventually).

Proposition: all convergent sequences in a metric space are Cauchy, because eventual closeness to the limit implies eventual closeness to each other.
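The one-line argument can be spelled out as a short estimate. The sketch below assumes a sequence \(p_1, p_2, \dots\) converging to \(p\) in a metric space with metric \(d\); the symbols \(\epsilon\) and \(N\) are the usual ones from the definition of convergence and are introduced here only for illustration.

```latex
Given $\epsilon > 0$, choose $N$ such that $d(p_n, p) < \epsilon/2$ for all $n > N$.
Then for all $n, m > N$, the triangle inequality gives
\[
d(p_n, p_m) \le d(p_n, p) + d(p, p_m) < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon .
\]
```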

Proposition: A Cauchy sequence that has a convergent subsequence is itself convergent.

Definition. A metric space \(E\) is complete if every Cauchy sequence of points in \(E\) converges to a point in \(E\).

Theorem: \(\mathbb{R}\) is complete.

Theorem: For any positive integer \(n\), \(\mathbb{R}^n\) is complete.

Definition of compactness:

Proposition. A compact subset of a metric space is bounded. In particular, a compact metric space is bounded.

Proposition: Nested set property.

Definition of cluster point.

Theorem: An infinite subset of a compact metric space has at least one cluster point.

Wednesday, December 16, 2015

Math snippets

Compare with , note that iff .

Cauchy–Schwarz inequality in vector form:

We have two seqs: , where . Obviously , but

Proposition: generalize the triangle inequality to multi-angle inequality:

Thursday, November 26, 2015

De Méré's puzzle

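import scala.collection.mutable.ArrayBuffer

// Enumerate all 6^3 outcomes of three dice and compare
// the probability of a sum of 11 with that of a sum of 12.
object DeMere extends App {
  // `val a, b = expr` evaluates expr once per name, so these are two separate buffers
  val sum11, sum12 = ArrayBuffer.empty[Array[Int]]
  for {
    i <- 1 to 6
    j <- 1 to 6
    k <- 1 to 6
  } {
    (i + j + k) match {
      case 11 => sum11 += Array(i, j, k)
      case 12 => sum12 += Array(i, j, k)
      case _  => // other sums are not of interest
    }
  }
  val total = math.pow(6, 3) // 216 equally likely outcomes
  println(sum11.length / total) // 27 / 216
  println(sum12.length / total) // 25 / 216
}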

Monday, November 23, 2015

Machine learning foundation 2

Hoeffding’s inequality

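The sample mean is unlikely to be far from the true mean when the sample size \(N\) is large:

\[P(|\nu - \mu| > \epsilon) \le 2\exp(-2\epsilon^2 N)\]
, where \(\nu\) is the sample mean and \(\mu\) is the true mean. In other words, the statement \(\nu = \mu\) is probably approximately correct (PAC).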
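A quick simulation can make the bound tangible. The sketch below is only an illustration under assumptions that are not in the notes: observations are coin flips with true mean 0.4, the sample size is 100, the tolerance is 0.1, and `mulberry32`, `hoeffdingBound` and `badSampleRate` are names chosen here. It counts how often the sample mean misses the true mean by more than the tolerance and compares that rate with the right-hand side of the inequality.

```javascript
// Deterministic PRNG (mulberry32) so that repeated runs give the same numbers.
function mulberry32(seed) {
  var a = seed | 0;
  return function () {
    a = (a + 0x6D2B79F5) | 0;
    var t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Right-hand side of Hoeffding's inequality: 2 * exp(-2 * eps^2 * n).
function hoeffdingBound(eps, n) {
  return 2 * Math.exp(-2 * eps * eps * n);
}

// Draw `trials` samples of size n from a coin with P(1) = mu and report
// the fraction of samples whose mean nu misses mu by more than eps.
function badSampleRate(mu, n, eps, trials, rand) {
  var bad = 0;
  for (var t = 0; t < trials; t++) {
    var ones = 0;
    for (var i = 0; i < n; i++) {
      if (rand() < mu) { ones++; }
    }
    var nu = ones / n;
    if (Math.abs(nu - mu) > eps) { bad++; }
  }
  return bad / trials;
}

var rate = badSampleRate(0.4, 100, 0.1, 2000, mulberry32(42));
var bound = hoeffdingBound(0.1, 100);
console.log("observed rate of bad samples:", rate, "Hoeffding bound:", bound);
```

The observed rate should sit well below the bound, which is expected: Hoeffding's inequality holds for any distribution on a bounded range, so it is usually loose for a specific one.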

When applied to machine learning, this means:

Sunday, November 22, 2015

Machine learning foundation 1

Elements

We have

  • Unknown target function (underlying the data)
  • Training examples (the data)
  • Hypothesis set (possible approximations to the target function)
  • Learning algorithm (for choosing the “best” from the hypothesis set)
  • Final hypothesis (result of learning)

Perceptrons

Hypothesis in the vector form

\[h(x) = \mathrm{sign}(w^T x)\]
\(x\) is a vector here, and so is the weight vector \(w\).

Perceptron Learning Algorithm (PLA)

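\(w_t\) and \(x_n\) are vectors here.

  1. Start with an arbitrary \(w_0\), say \(w_0 = 0\).
  2. On the first case where \(\mathrm{sign}(w_t^T x_n) \ne y_n\), update \(w_t\) like this: \(w_{t+1} = w_t + y_n x_n\). This works because \(y_n w_{t+1}^T x_n = y_n (w_t + y_n x_n)^T x_n\), or \(y_n w_{t+1}^T x_n = y_n w_t^T x_n + \|x_n\|^2\) (using \(y_n^2 = 1\)), or \(y_n w_{t+1}^T x_n \ge y_n w_t^T x_n\), which guarantees that the iteration of \(w_t\) moves \(w_t^T x_n\) in the direction of \(y_n\).
  3. Continue iterating until no mistake occurs for any \(x_n\) in a full loop.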
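The three steps above can be sketched in a few lines of JavaScript. This is a minimal illustration, not a reference implementation: the toy data set, the `maxUpdates` safety cap, and the choice to treat `sign(0)` as -1 are all assumptions made here. Each input carries a leading 1 so that the first weight plays the role of the threshold.

```javascript
// Inner product of two equally long arrays.
function dot(a, b) {
  var s = 0;
  for (var i = 0; i < a.length; i++) { s += a[i] * b[i]; }
  return s;
}

// sign(0) is treated as -1 here (an arbitrary convention for this sketch).
function sign(v) { return v > 0 ? 1 : -1; }

// Cyclic PLA: sweep over the data, correct each mistake with w <- w + y_n * x_n,
// and stop after a full pass without mistakes (or after maxUpdates corrections).
function pla(xs, ys, maxUpdates) {
  var w = xs[0].map(function () { return 0; }); // step 1: w_0 = 0
  var updates = 0;
  var cleanPass = false;
  while (!cleanPass && updates < maxUpdates) {
    cleanPass = true;
    for (var n = 0; n < xs.length && updates < maxUpdates; n++) {
      if (sign(dot(w, xs[n])) !== ys[n]) {       // step 2: a mistake on x_n
        w = w.map(function (wi, i) { return wi + ys[n] * xs[n][i]; });
        updates++;
        cleanPass = false;
      }
    }
  }
  return { w: w, updates: updates, converged: cleanPass }; // step 3
}

// Toy data (made up): label +1 when x1 + x2 > 2, else -1; first entry is the constant 1.
var xs = [[1, 2, 2], [1, 3, 1], [1, 0, 0], [1, 1, 0]];
var ys = [1, 1, -1, -1];
var result = pla(xs, ys, 1000);
console.log(result);
```

Because the toy data are linearly separable, the loop ends with a clean pass; on non-separable data only the `maxUpdates` cap would stop it.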

Questions about the PLA

  • Will it ever stop?
  • When it stops, will the result be close to the (unknown) target function?
  • Will it ever make a mistake (inside/outside your dataset)?

Linear separability

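PLA does not always halt: if the data are not linearly separable, some example is always misclassified and the updates never stop.

Solution guaranteed when the data are linearly separable

Agreement between prediction and existing data can be summarized by:

\[y_n w_f^T x_n\]
, where \(w_f\) is the "true and unknown" weight vector

Since \(w_f^T x_n\) always has the same sign as \(y_n\), this should be positive for every \(n\).
Hence existence of a solution means that \(\min_n y_n w_f^T x_n > 0\).

How do we know the \(w_t\) in PLA will get close enough to \(w_f\)?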

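\[w_f^T w_{t+1} = w_f^T (w_t + y_n x_n) \ge w_f^T w_t + \min_n y_n w_f^T x_n > w_f^T w_t\]
We can see the inner product \(w_f^T w_t\) gets larger as \(t\) grows. This is a good sign, but we still need to check whether this is because they become more and more similar in direction or because \(w_t\) becomes larger in magnitude.

Since \(t\) only grows when there is a mistake, we know \(2y_n w_t^T x_n \le 0\), hence

\[\|w_{t+1}\|^2 = \|w_t\|^2 + 2y_n w_t^T x_n + \|y_n x_n\|^2 \le \|w_t\|^2 + \|x_n\|^2 \le \|w_t\|^2 + R^2\]
, where \(R\) is the max magnitude of all possible \(x_n\). At least only limited growth is seen in \(\|w_t\|\).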

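If we assume \(w_0 = 0\), using a telescoping sum we can get
\[\|w_t\|^2 \le tR^2\]
, or
\[\|w_t\| \le \sqrt{t}R\]
If we let

\[\rho = \min_n y_n w_f^T x_n\]
, then

\[w_f^T w_{t+1} \ge w_f^T w_t + \rho\]
Using telescope collapsing again we get \(w_f^T w_t \ge t\rho\).

To eliminate influence from magnitude we can normalize both \(w_f\) and \(w_t\) and check their inner product:

\[\frac{w_f^T w_t}{\|w_f\|\,\|w_t\|} \ge \frac{t\rho}{\|w_f\|\sqrt{t}R} = \sqrt{t}\,\frac{\rho}{\|w_f\| R}\]
So, indeed, \(w_f\) and \(w_t\) are getting closer and closer to each other in direction. Let

\[C = \frac{\rho}{\|w_f\| R}\]
And since the normalized inner product is a cosine and can never exceed 1, we know eventually

\[\sqrt{t}\,C \le 1 \quad\Longrightarrow\quad t \le \frac{1}{C^2} = \frac{R^2 \|w_f\|^2}{\rho^2}\]
So, not only do we know that PLA will find the solution (provided there is a solution), we also know it will find it in at most that many (finite) steps.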

Saturday, November 14, 2015

Use Apache Spark in IntelliJ

Add this line to your build.sbt:

libraryDependencies += "org.apache.spark" %% "spark-core" % "1.5.2"

Do something like this:

object Script3 {
  import org.apache.spark.SparkContext
  import org.apache.spark.SparkConf
  // local[4] means that you want spark to run locally with 4 threads 
  // you can use a cluster when your app is production ready, of course
  val conf = new SparkConf().setAppName("appspark").setMaster("local[4]")
  val sc = new SparkContext(conf)
  val lines = sc.textFile(getClass.getResource("/mtcars.txt").toString)
  val lineLengths = lines.map(x => x.length)
  val totalLength = lineLengths.reduce(_ + _)
}

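// `extends App` gives the object a main method so it can be run directly
object Script4 extends App {
  import Script3._

  // Spark logs are too verbose by default;
  // I only want messages when something is wrong.
  // Set the levels before the first use of Script3, which starts Spark.
  import org.apache.log4j.Logger
  import org.apache.log4j.Level
  Logger.getLogger("org").setLevel(Level.WARN)
  Logger.getLogger("akka").setLevel(Level.WARN)
  println(lines)       // prints a description of the RDD, not the file contents
  println(totalLength) // forces the computation and prints the result
  sc.stop()
}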

Friday, November 13, 2015

Visualize planet orbits with three.js

<!--
 ! Excerpted from "3D Game Programming for Kids",
 ! published by The Pragmatic Bookshelf.
 ! Copyrights apply to this code. It may not be used to create training material,
 ! courses, books, articles, and the like. Contact us if you are in doubt.
 ! We make no guarantees that this code is fit for any purpose.
 ! Visit http://www.pragmaticprogrammer.com/titles/csjava for more book information.
-->
<body></body>
<!--<script src="http://gamingJS.com/Three.js"></script>-->
<!--<script src="http://gamingJS.com/ChromeFixes.js"></script>-->
<script src="js/three.min.js"></script>
<script>
    // This is where stuff in our game will happen:
    var scene = new THREE.Scene();

    // This is what sees the stuff:
    var aspect_ratio = window.innerWidth / window.innerHeight;
    var above_cam = new THREE.PerspectiveCamera(75, aspect_ratio, 1, 1e6);
    above_cam.position.z = 1000;
    scene.add(above_cam);

    var earth_cam = new THREE.PerspectiveCamera(75, aspect_ratio, 1, 1e6);
    scene.add(earth_cam);

    var camera = above_cam;

    // This will draw what the camera sees onto the screen:
    var renderer = new THREE.WebGLRenderer();
    renderer.setSize(window.innerWidth, window.innerHeight);
    document.body.appendChild(renderer.domElement);

    // ******** START CODING ON THE NEXT LINE ********
    document.body.style.backgroundColor = 'black';

    var surface = new THREE.MeshPhongMaterial({ambient: 0xFFD700});
    var star = new THREE.SphereGeometry(50, 28, 21);
    var sun = new THREE.Mesh(star, surface);
    scene.add(sun);

    var ambient = new THREE.AmbientLight(0xffffff);
    scene.add(ambient);

    var sunlight = new THREE.PointLight(0xffffff, 15, 1000, 1);
    sun.add(sunlight);

    var surface = new THREE.MeshPhongMaterial({ambient: 0x1a1a1a, color: 0x0000cd});
    var planet = new THREE.SphereGeometry(20, 120, 115);
    var earth = new THREE.Mesh(planet, surface);
    earth.position.set(250, 0, 0);
    scene.add(earth);

    var surface = new THREE.MeshPhongMaterial({ambient: 0x1a1a1a, color: 0xb22222});
    var planet = new THREE.SphereGeometry(20, 120, 115);
    var mars = new THREE.Mesh(planet, surface);
    mars.position.set(500, 0, 0);
    scene.add(mars);

    var clock = new THREE.Clock();

    function animate() {
        requestAnimationFrame(animate);

        var time = clock.getElapsedTime();

        var e_angle = time * 0.8;
        earth.position.set(250 * Math.cos(e_angle), 250 * Math.sin(e_angle), 0);

        var m_angle = time * 0.3;
        mars.position.set(500 * Math.cos(m_angle), 500 * Math.sin(m_angle), 0);

        var y_diff = mars.position.y - earth.position.y,
                x_diff = mars.position.x - earth.position.x,
                angle = Math.atan2(x_diff, y_diff);

        // http://fs5.directupload.net/images/151113/aqz9jn7v.jpg
        // camera faces the same direction as Z-axis by default
        earth_cam.rotation.set(Math.PI / 2, -angle, 0);
        earth_cam.position.set(earth.position.x, earth.position.y, 22);

        // Now, show what the camera sees on the screen:
        renderer.render(scene, camera);
    }

    animate();

    var stars = new THREE.Geometry();
    while (stars.vertices.length < 1e4) {
        var lat = Math.PI * Math.random() - Math.PI / 2;
        var lon = 2 * Math.PI * Math.random();

        stars.vertices.push(new THREE.Vector3(
                1e5 * Math.cos(lon) * Math.cos(lat),
                1e5 * Math.sin(lon) * Math.cos(lat),
                1e5 * Math.sin(lat)
        ));
    }
    var star_stuff = new THREE.ParticleBasicMaterial({size: 500});
    var star_system = new THREE.ParticleSystem(stars, star_stuff);
    scene.add(star_system);

    document.addEventListener("keydown", function (event) {
        var code = event.keyCode;

        if (code == 65) { // A
            camera = above_cam;
        }
        if (code == 69) { // E
            camera = earth_cam;
        }
    });

</script>

Monday, November 2, 2015

Javascript execution weird

var mydata = []
d3.csv("https://raw.githubusercontent.com/kindlychung/cytob/master/data/hg19.csv", function (err, data) {
    var data1 = data.filter(function (d) {
        return d.chr == "chr22";
    })
    var innerdata = [];
    for(var i = 0; i < data1.length; i++) {
        mydata.push(data1[i].cyto);
        innerdata.push(data1[i].cyto);
    }
    console.log(innerdata);
    console.log("inside", mydata);
})
console.log("outside", mydata);

Result:

outside []
test.js:11 ["p13", "p12", "p11.2", "p11.1", "q11.1", "q11.21", "q11.22", "q11.23", "q12.1", "q12.2", "q12.3", "q13.1", "q13.2", "q13.31", "q13.32", "q13.33"]
test.js:12 inside ["p13", "p12", "p11.2", "p11.1", "q11.1", "q11.21", "q11.22", "q11.23", "q12.1", "q12.2", "q12.3", "q13.1", "q13.2", "q13.31", "q13.32", "q13.33"]

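It looks strange that the outside log is executed before the inside log, but this is expected: d3.csv is asynchronous. It returns immediately, and the callback only runs once the file has been downloaded, by which time the outside log has already printed the still-empty array.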
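The same ordering can be reproduced without D3 or a network request. In the sketch below, `fakeCsv` is a hypothetical stand-in for d3.csv that delivers two made-up rows through `setTimeout`; everything else mirrors the snippet above.

```javascript
// Hypothetical stand-in for d3.csv: hands the rows to the callback later,
// after the currently running code has finished.
function fakeCsv(url, callback) {
  setTimeout(function () {
    callback(null, [{ cyto: "p13" }, { cyto: "p12" }]);
  }, 0);
}

var order = [];    // records which log statement ran first
var loaded = [];

fakeCsv("hg19.csv", function (err, data) {
  for (var i = 0; i < data.length; i++) {
    loaded.push(data[i].cyto);
  }
  order.push("inside");
  console.log("inside", loaded);
});

order.push("outside");
console.log("outside", loaded); // runs first, while loaded is still empty
```

Anything that depends on the loaded data therefore has to live inside the callback, or be called from it.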

Saturday, October 31, 2015

Learning D3.js

The basics

d3.select("body").append("svg")
    .attr("width", 50)
    .attr("height", 50)
    .append("circle")
    .attr("cx", 25)
    .attr("cy", 25)
    .attr("r", 25)
    .style("fill", "purple")

var theData = [ 1, 2, 3 ]
var p = d3.select("body").selectAll("p")
  .data(theData)
  .enter()
  .append("p")
    .text("Hello");
    //.text(function () {
    //    return "hello world!";
    //});
    //.text(function (d) {
    //    return d * 2
    //});
    //.text(function (d, i) {
    //    return "Index: " + i + ", Val: " + d;
    //});

console.log(p)

Binding shape properties to data

var circleRadii = [40, 20, 10];

var svgContainer = d3.select("body").append("svg")
    .attr("width", 600)
    .attr("height", 100);

var circles = svgContainer.selectAll("circle")
    .data(circleRadii)
    .enter()
    .append("circle")

var circleAttributes = circles
    .attr("cx", 50)
    .attr("cy", 50)
    .attr("r", function (d) { return d; })
    .style("fill", function(d) {
        var returnColor;
        if (d === 40) { returnColor = "green";
        } else if (d === 20) { returnColor = "purple";
        } else if (d === 10) { returnColor = "red"; }
        return returnColor;
    });

Binding styles and coordinates to data

var spaceCircles = [30, 70, 110];

var svgContainer = d3.select("body").append("svg")
    .attr("width", 200)
    .attr("height", 200);

var circles = svgContainer.selectAll("circle")
    .data(spaceCircles)
    .enter()
    .append("circle");

var circleAttributes = circles
    .attr("cx", function (d) { return d; })
    .attr("cy", function (d) { return d; })
    .attr("r", 20 )
    .style("fill", function(d) {
        var returnColor;
        if (d === 30) { returnColor = "green";
        } else if (d === 70) { returnColor = "purple";
        } else if (d === 110) { returnColor = "red"; }
        return returnColor;
    });

Using json object as data

var jsonCircles = [
    {
        "x_axis": 30,
        "y_axis": 30,
        "radius": 20,
        "color" : "green"
    }, {
        "x_axis": 70,
        "y_axis": 70,
        "radius": 20,
        "color" : "purple"
    }, {
        "x_axis": 110,
        "y_axis": 100,
        "radius": 20,
        "color" : "red"
    }];

var svgContainer = d3.select("body").append("svg")
    .attr("width", 200)
    .attr("height", 200);


var circles = svgContainer.selectAll("circle")
    .data(jsonCircles)
    .enter()
    .append("circle");

var circleAttributes = circles
    .attr("cx", function (d) { return d.x_axis; })
    .attr("cy", function (d) { return d.y_axis; })
    .attr("r", function (d) { return d.radius; })
    .style("fill", function(d) { return d.color; });

Polylines

//The data for our line
var lineData = [ { "x": 1,   "y": 5},  { "x": 20,  "y": 20},
    { "x": 40,  "y": 10}, { "x": 60,  "y": 40},
    { "x": 80,  "y": 5},  { "x": 100, "y": 60}];

//This is the accessor function we talked about above
var lineFunction = d3.svg.line()
    .x(function(d) { return d.x; })
    .y(function(d) { return d.y; })
    .interpolate("linear");

//The SVG Container
var svgContainer = d3.select("body").append("svg")
    .attr("width", 200)
    .attr("height", 200);

//The line SVG Path we draw
var lineGraph = svgContainer.append("path")
    .attr("d", lineFunction(lineData))
    .attr("stroke", "blue")
    .attr("stroke-width", 2)
    .attr("fill", "none");

Dynamically determine size of svg

var jsonRectangles = [
    { "x_axis": 10, "y_axis": 10, "height": 20, "width":20, "color" : "green" },
    { "x_axis": 160, "y_axis": 40, "height": 20, "width":20, "color" : "purple" },
    { "x_axis": 70, "y_axis": 70, "height": 20, "width":20, "color" : "red" }];

var max_x = 0;
var max_y = 0;

for (var i = 0; i < jsonRectangles.length; i++) {
    // right and bottom edge of the current rectangle
    var temp_x = jsonRectangles[i].x_axis + jsonRectangles[i].width;
    var temp_y = jsonRectangles[i].y_axis + jsonRectangles[i].height;

    if ( temp_x >= max_x ) { max_x = temp_x; }

    if ( temp_y >= max_y ) { max_y = temp_y; }
}

var svgContainer = d3.select("body").append("svg")
    .attr("width", max_x)
    .attr("height", max_y)

var rectangles = svgContainer.selectAll("rect")
    .data(jsonRectangles)
    .enter()
    .append("rect");

var rectangleAttributes = rectangles
    .attr("x", function (d) { return d.x_axis; })
    .attr("y", function (d) { return d.y_axis; })
    .attr("height", function (d) { return d.height; })
    .attr("width", function (d) { return d.width; })
    .style("fill", function(d) { return d.color; });

Linear scale

var initialScaleData = [0, 1000, 3000, 2000, 5000, 4000, 7000, 6000, 9000, 8000, 10000];

var newScaledData = [];
var minDataPoint = d3.min(initialScaleData);
var maxDataPoint = d3.max(initialScaleData);

var linearScale = d3.scale.linear()
                           .domain([minDataPoint,maxDataPoint])
                           .range([0,100]);

for (var i = 0; i < initialScaleData.length; i++) {
  newScaledData[i] = linearScale(initialScaleData[i]);
}

newScaledData;
//[0, 10, 30, 20, 50, 40, 70, 60, 90, 80, 100]
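As a rough picture of what the scale computes, the mapping can be written by hand. `makeLinearScale` below is a hypothetical helper, not part of D3, and it leaves out what the real scale offers beyond the basic formula (clamping, inversion, multi-segment domains).

```javascript
// Hand-written stand-in for a linear scale: maps the interval [d0, d1] onto [r0, r1].
// makeLinearScale is an illustrative name, not a D3 function.
function makeLinearScale(domain, range) {
  var d0 = domain[0], d1 = domain[1];
  var r0 = range[0], r1 = range[1];
  return function (x) {
    // Multiply before dividing so the integer inputs used here stay exact.
    return r0 + (x - d0) * (r1 - r0) / (d1 - d0);
  };
}

var sampleData = [0, 1000, 3000, 2000, 5000, 4000, 7000, 6000, 9000, 8000, 10000];
var handScale = makeLinearScale([0, 10000], [0, 100]);
var handScaled = sampleData.map(handScale);
console.log(handScaled);
// [0, 10, 30, 20, 50, 40, 70, 60, 90, 80, 100]
```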

Transformation

var circleData = [
    { "cx": 20, "cy": 20, "radius": 20, "color" : "green" },
    { "cx": 70, "cy": 70, "radius": 20, "color" : "purple" }];


var rectangleData = [
    { "rx": 110, "ry": 110, "height": 30, "width": 30, "color" : "blue" },
    { "rx": 160, "ry": 160, "height": 30, "width": 30, "color" : "red" }];

var svgContainer = d3.select("body").append("svg")
    .attr("width",200)
    .attr("height",200);

var circleGroup = svgContainer.append("g")
    .attr("transform", "translate(80,0)")

var circles = circleGroup.selectAll("circle")
    .data(circleData)
    .enter()
    .append("circle");

var circleAttributes = circles
    .attr("cx", function (d) { return d.cx; })
    .attr("cy", function (d) { return d.cy; })
    .attr("r", function (d) { return d.radius; })
    .style("fill", function (d) { return d.color; });

var rectangles = svgContainer.selectAll("rect")
    .data(rectangleData)
    .enter()
    .append("rect");

var rectangleAttributes = rectangles
    .attr("x", function (d) { return d.rx; })
    .attr("y", function (d) { return d.ry; })
    .attr("height", function (d) { return d.height; })
    .attr("width", function (d) { return d.width; })
    .style("fill", function(d) { return d.color; });

Adding text

//Circle Data Set
var circleData = [
    { "cx": 20, "cy": 20, "radius": 20, "color" : "green" },
    { "cx": 70, "cy": 70, "radius": 20, "color" : "purple" }];

//Select the SVG Viewport (assumes the page already contains <svg id="svgContainer">)
var svgContainer = d3.select("#svgContainer")
    .attr("width",200)
    .attr("height",200);

//Add the SVG Text Element to the svgContainer
var text = svgContainer.selectAll("text")
    .data(circleData)
    .enter()
    .append("text");

var circles = svgContainer.selectAll("circle")
    .data(circleData)
    .enter()
    .append("circle")
    .attr("cx", function(d) {return d.cx})
    .attr("cy", function(d) {return d.cy})
    .attr("r", function(d) {return d.radius})
    .attr("fill", function(d) {return d.color})

//Add SVG Text Element Attributes
var textLabels = text
    .attr("x", function(d) { return d.cx; })
    .attr("y", function(d) { return d.cy; })
    .text( function (d) { return "( " + d.cx + ", " + d.cy +" )"; })
    .attr("font-family", "sans-serif")
    .attr("font-size", "20px")
    .attr("fill", "red");

Thinking in data joins: update, enter, exit


var width = 960,
    height = 500;

var svg = d3.select("body").append("svg")
    .attr("width", width)
    .attr("height", height)
    .append("g")
    .attr("transform", "translate(32," + (height / 2) + ")");

function update(data) {

    // DATA JOIN
    // Join new data with old elements, if any.
    var text = svg.selectAll("text")
        .data(data);

    // UPDATE
    // Update old elements as needed.
    // Initially this part is empty
    text.attr("class", "update").attr("fill", "blue");

    // ENTER
    // Create new elements as needed.
    text.enter().append("text")
        .attr("class", "enter")
        .attr("fill", "green")
        .attr("x", function(d, i) { return i * 32; })
        .attr("dy", ".35em");

    // ENTER + UPDATE
    // Appending to the enter selection expands the update selection to include
    // entering elements; so, operations on the update selection after appending to
    // the enter selection will apply to both entering and updating nodes.
    text.text(function(d) { return d; });

    // EXIT
    // Remove old elements as needed.
    text.exit().remove();
}

In the javascript console:

update([1, 2])


update([1, 2, 3, 4])


update([1, 2, 3, 4, 5, 6])

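The bookkeeping behind the three calls can be imitated in plain JavaScript. The sketch below assumes the default join by index (no key function) and uses a hypothetical helper, `joinByIndex`, that only sorts data and existing elements into the three groups; it does not touch the DOM.

```javascript
// Index-based join, mimicking selection.data(array) without a key function.
// joinByIndex is a hypothetical helper for illustration, not a D3 API.
function joinByIndex(elements, data) {
  var shared = Math.min(elements.length, data.length);
  return {
    update: data.slice(0, shared), // data matched to elements that already exist
    enter: data.slice(shared),     // data with no element yet
    exit: elements.slice(shared)   // elements with no data left
  };
}

var first = joinByIndex([], [1, 2]);             // nothing on the page yet
var second = joinByIndex([1, 2], [1, 2, 3, 4]);  // two elements exist, four data
var third = joinByIndex([1, 2, 3, 4], [1, 2]);   // four elements exist, two data
console.log(first, second, third);
```

On the first call everything is in `enter`; on the second the first two items are in `update` and the new ones in `enter`; shrinking the data moves the surplus elements into `exit`.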

Now with a touch of animation (transition in coordinates):

var width = 960,
    height = 500;

var svg = d3.select("body").append("svg")
    .attr("width", width)
    .attr("height", height)
    .append("g")
    .attr("transform", "translate(32," + (height / 2) + ")");

function update(data) {
    var text = svg.selectAll("text")
        .data(data);
    text.attr("class", "update").attr("fill", "blue");
    text.enter().append("text")
        .attr("class", "enter")
        .attr("fill", "green")
        .attr("x", function(d, i) { return i * 32; })
        .attr("y", -50)
        .transition()
        .attr("y", 0)
        .attr("dy", ".35em");
    text.text(function(d) { return d; });
    text.exit()
        .attr("y", 0)
        .transition()
        .attr("y", -50)
        .remove();
}

Nested Selection

var tableBody = d3.select("body").append("table").append("tbody");
//var cells = tableBody.selectAll("tr").selectAll("td")
//    .style("color", function(d, i, j) {return i === j ? "green" : "lightblue"})

var matrix = [
    [ 0,  1,  2,  3],
    [ 4,  5,  6,  7],
    [ 8,  9, 10, 11],
    [12, 13, 14, 15],
];
var cells = tableBody.selectAll("tr")
    .data(matrix)
    .enter()
    .append("tr")
    .selectAll("td")
    .data(function(d, i) {return d})
    .enter()
    .append("td")
    .html(function(d) {return "<b>" + d + "</b>";});