## Definition of metric spaces

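Definition. A metric space is a set $E$, together with a rule which associates with each pair $p, q \in E$ a real number $d \left( p, q \right)$ such that:

1. $d(p, q) \ge 0$ for all $p, q \in E$
2. $d(p, q) = 0$ if and only if $p = q$
3. $d(p, q) = d(q, p)$ for all $p, q \in E$
4. $d(p, r) \le d(p, q) + d(q, r)$ for all $p, q, r \in E$ (triangle inequality)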

Proposition (Schwarz inequality):

Cauchy–Schwarz inequality in vector form:

Corollary of Schwarz inequality:

Or in vector form:
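For reference, these statements can be written out in their standard textbook form. The notation ($a = (a_1, \dots, a_n)$ and $b = (b_1, \dots, b_n)$ as real vectors) is an assumption and may differ from what these notes originally used.

Schwarz inequality, in components and in vector form:

$\left| \sum_{i=1}^n a_i b_i \right| \le \sqrt{\sum_{i=1}^n a_i^2} \sqrt{\sum_{i=1}^n b_i^2}$

$\left| a \cdot b \right| \le \left| a \right| \left| b \right|$

A corollary that is commonly derived from it (the triangle inequality for the Euclidean norm), in components and in vector form:

$\sqrt{\sum_{i=1}^n (a_i + b_i)^2} \le \sqrt{\sum_{i=1}^n a_i^2} + \sqrt{\sum_{i=1}^n b_i^2}$

$\left| a + b \right| \le \left| a \right| + \left| b \right|$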

Proposition: generalize the triangle inequality to multi-angle inequality:

Proposition: difference of two sides of a triangle is less than the third side.
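As a sketch of what the last two propositions usually say (the symbols $p_1, \dots, p_n$ and $p, q, r$ for points of $E$ are assumptions), the generalized triangle inequality follows by applying the triangle inequality repeatedly:

$d(p_1, p_n) \le d(p_1, p_2) + d(p_2, p_3) + \dots + d(p_{n-1}, p_n)$

For the difference of two sides, the triangle inequality gives $d(p, r) \le d(p, q) + d(q, r)$, so $d(p, r) - d(q, r) \le d(p, q)$. Swapping the roles of $p$ and $q$ gives the same bound for $d(q, r) - d(p, r)$, hence

$\left| d(p, r) - d(q, r) \right| \le d(p, q)$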

## Open and closed sets

Definition of open and closed balls:

Definition of open set:
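In the usual formulation (the symbols $p_0$ for the center and $r > 0$ for the radius are assumptions), the open and closed balls in a metric space $E$ are

$\left\{ p \in E : d(p_0, p) < r \right\}$ (open ball)

$\left\{ p \in E : d(p_0, p) \le r \right\}$ (closed ball)

and a subset $S \subset E$ is open if every point of $S$ is the center of some open ball contained in $S$:

$\forall p \in S \; \exists r > 0 : \left\{ q \in E : d(p, q) < r \right\} \subset S$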

Proposition: basic properties of metric spaces:

Proposition: an open ball is also an open set.

Definition of closed sets:

Proposition: a closed ball is also a closed set.

Definition of boundedness:

Proposition: a nonempty closed subset of $R$ that is bounded above contains its least upper bound, and one that is bounded below contains its greatest lower bound.

## Convergence

Definition of convergence:
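The standard $\epsilon$-$N$ form of this definition, assuming a sequence $p_1, p_2, p_3, \dots$ of points in a metric space $E$ and a candidate limit $p \in E$, reads:

$\forall \epsilon > 0 \; \exists N : n > N \implies d(p_n, p) < \epsilon$

As a quick worked example in $R$ with $d(x, y) = |x - y|$: the sequence $p_n = 1/n$ converges to $0$, because for a given $\epsilon > 0$ any $N \ge 1/\epsilon$ gives $|1/n - 0| < \epsilon$ whenever $n > N$.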

Uniqueness of convergence:

Definition of subsequence:

Subsequence of a convergent sequence is also convergent:

Convergent sequences are bounded:

Theorem. A subset $S$ of a metric space is closed iff every sequence of points of $S$ that converges in the space has its limit in $S$.

Proposition. If for two sequences $a_n, b_n$ with limits $a, b$ respectively, and $a_n \le b_n$ always holds, then $a \le b$.

Proposition. A bounded monotonic sequence of real numbers is convergent.

## Completeness

Definition of Cauchy sequence: elements are guaranteed to be arbitrarily close to each other (eventually).

Proposition: every convergent sequence in a metric space is Cauchy, because eventual closeness to the limit implies eventual closeness to each other.
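Written out as a sketch (the symbols $p_n$, $p$, $\epsilon$ and $N$ are assumptions), a sequence is Cauchy when

$\forall \epsilon > 0 \; \exists N : n, m > N \implies d(p_n, p_m) < \epsilon$

If $p_n$ converges to $p$, choose $N$ such that $d(p_n, p) < \epsilon / 2$ for all $n > N$. Then for $n, m > N$ the triangle inequality gives

$d(p_n, p_m) \le d(p_n, p) + d(p, p_m) < \epsilon / 2 + \epsilon / 2 = \epsilon$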

Proposition: A Cauchy sequence that has a convergent subsequence is itself convergent.

Definition. A metric space $E$ is complete if every Cauchy sequence of points in $E$ converges to a point in $E$.

Theorem: R is complete.

Theorem: For any positive integer n, $E^n$ is complete.

Definition of compactness:

Proposition. A compact subset of a metric space is bounded. In particular, a compact metric space is bounded.

Proposition: Nested set property.

Definition of cluster point.

Theorem: An infinite subset of a compact metric space has at least one cluster point.

## Math snippets

Compare $R^+$ with $A = \left\{ x^2 | x \in R^+ \text{ and } x \ne 0 \right\}$, note that $m \in R^+$ iff $m \in A$.

Cauchy–Schwarz inequality in vector form:
$\left| a \cdot b \right| \le \left| a \right| \left| b \right|$

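We have two sequences: $a_n = 1/n, b_n = 1/n^2$, where $n = 2, 3, 4, 5, ...$. Obviously $a_n > b_n$, but both converge to $0$, so a strict inequality between the terms only guarantees $\lim a_n \ge \lim b_n$, not a strict inequality between the limits.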

## De Méré's puzzle

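import scala.collection.mutable.ArrayBuffer

object DeMere extends App {
  // Outcomes of three dice that sum to 11 and to 12, respectively
  val sum11, sum12 = ArrayBuffer.empty[Array[Int]]
  for {
    i <- 1 to 6
    j <- 1 to 6
    k <- 1 to 6
  } {
    (i + j + k) match {
      case 11 => sum11 += Array(i, j, k)
      case 12 => sum12 += Array(i, j, k)
      case _  =>
    }
  }
  val total = math.pow(6, 3) // 216 equally likely outcomes
  println(sum11.length / total) // probability that the sum is 11
  println(sum12.length / total) // probability that the sum is 12
}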


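## Hoeffding's inequality

The sample mean is unlikely to be far from the true mean when the sample size $N$ is large. For independent samples taking values in $[0, 1]$:

$P \left[ \left| \nu - \mu \right| > \epsilon \right] \le 2 e^{-2 \epsilon^2 N}$

, where $\nu$ is the sample mean, $\mu$ is the true mean and $\epsilon > 0$ is the tolerance. In other words, $\nu$ is probably approximately correct (PAC).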
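To get a feel for the numbers, here is a small sketch that evaluates the usual form of the bound, $2 e^{-2 \epsilon^2 N}$, and inverts it to find a sample size. The function names `hoeffdingBound` and `sampleSizeFor`, and the confidence parameter `delta`, are hypothetical helpers introduced only for this illustration.

```javascript
// Upper bound on P[|nu - mu| > eps] for N independent samples in [0, 1].
function hoeffdingBound(eps, N) {
  return 2 * Math.exp(-2 * eps * eps * N);
}

// Smallest N for which the bound drops to delta or below:
// 2 * exp(-2 * eps^2 * N) <= delta  <=>  N >= ln(2 / delta) / (2 * eps^2)
function sampleSizeFor(eps, delta) {
  return Math.ceil(Math.log(2 / delta) / (2 * eps * eps));
}

console.log(hoeffdingBound(0.1, 1000)); // a very small probability
console.log(sampleSizeFor(0.1, 0.05));  // samples needed for 95% confidence
```

Note that the bound does not depend on $\mu$ at all, only on the tolerance and the sample size.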

When applied to machine learning, this means:
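One common way to state this in the learning setting, assuming a single hypothesis $h$ fixed before the $N$ training examples are drawn, uses the in-sample error $E_{in}(h)$ in the role of $\nu$ and the out-of-sample error $E_{out}(h)$ in the role of $\mu$ (both symbols are assumptions borrowed from the usual textbook treatment):

$P \left[ \left| E_{in}(h) - E_{out}(h) \right| > \epsilon \right] \le 2 e^{-2 \epsilon^2 N}$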

## Elements

We have

• Unknown target function (underlying the data)
• Training examples (the data)
• Hypothesis set (possible approximations to the target function)
• Learning algorithm (for choosing the “best” from the hypothesis set)
• Final hypothesis (result of learning)

## Perceptrons

### Hypothesis in the vector form

$h(x) = \text{sign}(w^T x)$, where $w$ and $x$ are vectors here.

### Perceptron Learning Algorithm (PLA)

$w, x$ are vectors here.

1. Start with an arbitrary $w_0$, say $\textbf{0}$.
2. On the first case where $\text{sign}(w_t^T x_n) \ne y_n$, update $w$ like this: $w_{t+1} = w_t + y_n x_n$. This works because $y_n w_{t+1}^T x_n = y_n w_t^T x_n + y_n^2 \|x_n\|^2 \ge y_n w_t^T x_n$, which guarantees that each update moves $w^T x_n$ in the direction of $y_n$.
3. Continue iterating until no mistake occurs for any $(x_n, y_n)$ in a full loop.
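
The three steps above can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name `pla`, the `maxUpdates` safety cap and the toy data set are assumptions, and the loop corrects mistakes in cyclic order rather than restarting from the first example after each update. Each input vector carries a leading 1 so that the threshold is part of $w$.

```javascript
function sign(v) { return v > 0 ? 1 : v < 0 ? -1 : 0; }
function dot(a, b) { return a.reduce((s, ai, i) => s + ai * b[i], 0); }

// Perceptron Learning Algorithm: X is an array of vectors, y an array of +1/-1.
function pla(X, y, maxUpdates = 1000) {
  let w = new Array(X[0].length).fill(0); // step 1: start with w_0 = 0
  let updates = 0;
  let mistakeFound = true;
  while (mistakeFound && updates < maxUpdates) {
    mistakeFound = false;
    for (let n = 0; n < X.length; n++) {
      if (sign(dot(w, X[n])) !== y[n]) {            // step 2: a mistake
        w = w.map((wi, i) => wi + y[n] * X[n][i]);  // w <- w + y_n x_n
        updates++;
        mistakeFound = true;
      }
    }
  } // step 3: stop after a full pass without mistakes
  return { w, updates };
}

// Toy, linearly separable data (hypothetical).
const X = [[1, 2, 2], [1, 1, 3], [1, -1, -2], [1, -2, -1]];
const y = [1, 1, -1, -1];
const result = pla(X, y);
console.log(result); // { w: [ 1, 2, 2 ], updates: 1 }
```

The `maxUpdates` cap is there only because the loop would never end on data that is not linearly separable.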

• Will it ever stop?
• When it stops, will the result be close to the target function (unknown)?
• Will it ever make a mistake (inside/outside your dataset)?

### Linear separability

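PLA does not always halt: if the data are not linearly separable, every $w$ misclassifies at least one example, so the updates never stop.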

### Solution guaranteed when the data are linearly separable

Agreement between prediction and existing data can be summarized by the quantity

$y_n w_f^T x_n$

, where $w_f$ is the "true and unknown" weight vector.

Since $w_f^T x_n$ always has the same sign as $y_n$, this quantity is positive for every $n$.
Hence the existence of a solution means that $\min_n y_n w_f^T x_n > 0$.

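How do we know the $w_t$ in PLA will get close enough to $w_f$?

With the update $w_{t+1} = w_t + y_n x_n$ on a misclassified example $(x_n, y_n)$, we have

$w_f^T w_{t+1} = w_f^T w_t + y_n w_f^T x_n \ge w_f^T w_t + \min_m y_m w_f^T x_m > w_f^T w_t$

So the inner product $w_f^T w_t$ gets larger as $t$ grows. This is a good sign, but we still need to check whether it grows because the two vectors become more and more similar in direction or merely because $w_t$ becomes larger in magnitude.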

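Since $t$ only grows when there is a mistake, we know $2y_n w_t^T x_n \le 0$, hence (using $y_n^2 = 1$)

$\|w_{t+1}\|^2 = \|w_t + y_n x_n\|^2 = \|w_t\|^2 + 2 y_n w_t^T x_n + \|x_n\|^2 \le \|w_t\|^2 + \|x_n\|^2 \le \|w_t\|^2 + R^2$

, where $R^2 = \max_n \|x_n\|^2$. So $\|w_t\|^2$ grows by at most $R^2$ per update.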

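If we assume $w_0 = 0$, telescoping the growth bound on $\|w_t\|^2$ gives
$\|w_t\|^2 \le tR^2$
, or
$\|w_t\| \le \sqrt{t} R$
If we let

$\rho = \min_n y_n w_f^T x_n > 0$

, then each update on a misclassified $(x_n, y_n)$ satisfies

$w_f^T w_{t+1} = w_f^T w_t + y_n w_f^T x_n \ge w_f^T w_t + \rho$

Telescoping again we get $w_f^T w_t \ge t \rho$.

To eliminate the influence of magnitude we can normalize both $w_f$ and $w_t$ and check their inner product:

$\frac{w_f}{\|w_f\|} \cdot \frac{w_t}{\|w_t\|} \ge \frac{t \rho}{\|w_f\| \sqrt{t} R} = \sqrt{t} \frac{\rho}{\|w_f\| R}$

So, indeed, $w_f$ and $w_t$ are getting closer and closer to each other in direction. Let

$c = \frac{\rho}{\|w_f\| R}$

.

And since $\frac{w_f}{\|w_f\|} \cdot \frac{w_t}{\|w_t\|}$ can never exceed 1, we know

$\sqrt{t} c \le 1$, that is, $t \le \frac{1}{c^2} = \frac{R^2 \|w_f\|^2}{\rho^2}$

So, not only do we know that PLA will find a solution (provided there is one), we also know it will find it after at most $R^2 \|w_f\|^2 / \rho^2$ updates, which is a finite number.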
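
As a numeric illustration of such a finite bound, the sketch below evaluates $R^2 \|w_f\|^2 / \rho^2$ with $R^2 = \max_n \|x_n\|^2$ and $\rho = \min_n y_n w_f^T x_n$, which is the bound this argument leads to for $w_0 = 0$. The function name `plaUpdateBound`, the toy data and the separating vector `wf` are all hypothetical; in practice $w_f$ is unknown, so the bound is a theoretical guarantee rather than something one can compute beforehand.

```javascript
function dot(a, b) { return a.reduce((s, ai, i) => s + ai * b[i], 0); }

// Upper bound on the number of PLA updates, given a known separator wf.
function plaUpdateBound(X, y, wf) {
  const R2 = Math.max(...X.map(x => dot(x, x)));               // max_n ||x_n||^2
  const rho = Math.min(...X.map((x, n) => y[n] * dot(wf, x))); // min_n y_n wf.x_n
  if (rho <= 0) throw new Error("wf does not separate the data");
  return (R2 * dot(wf, wf)) / (rho * rho);
}

// Toy data (hypothetical), separated by wf = (0, 1, 1).
const X = [[1, 2, 2], [1, 1, 3], [1, -1, -2], [1, -2, -1]];
const y = [1, 1, -1, -1];
const wf = [0, 1, 1];
const bound = plaUpdateBound(X, y, wf);
console.log(bound); // 22 / 9, so at most 2 updates on this data
```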

## Use Apache Spark in IntelliJ

Add this line in your build.sbt:

libraryDependencies += "org.apache.spark" %% "spark-core" % "1.5.2"


Do something like this:

object Script3 {
import org.apache.spark.SparkContext
import org.apache.spark.SparkConf
// local[4] means that you want spark to run locally with 4 threads
// you can use a cluster when your app is production ready, of course
val conf = new SparkConf().setAppName("appspark").setMaster("local[4]")
val sc = new SparkContext(conf)
val lines = sc.textFile(getClass.getResource("/mtcars.txt").toString)
val lineLengths = lines.map(x => x.length)
val totalLength = lineLengths.reduce(_ + _)
}

object Script4 {
import Script3._

// Spark logs are too verbose by default;
// I only want to see messages when something is wrong
import org.apache.log4j.Logger
import org.apache.log4j.Level
Logger.getLogger("org").setLevel(Level.WARN)
Logger.getLogger("akka").setLevel(Level.WARN)
println(lines)
}


## Visualize planet orbits with three.js

<!--
! Excerpted from "3D Game Programming for Kids",
! Copyrights apply to this code. It may not be used to create training material,
! courses, books, articles, and the like. Contact us if you are in doubt.
! We make no guarantees that this code is fit for any purpose.
! Visit http://www.pragmaticprogrammer.com/titles/csjava for more book information.
-->
<body></body>
<!--<script src="http://gamingJS.com/Three.js"></script>-->
<!--<script src="http://gamingJS.com/ChromeFixes.js"></script>-->
<script src="js/three.min.js"></script>
<script>
// This is where stuff in our game will happen:
var scene = new THREE.Scene();

// This is what sees the stuff:
var aspect_ratio = window.innerWidth / window.innerHeight;
var above_cam = new THREE.PerspectiveCamera(75, aspect_ratio, 1, 1e6);
above_cam.position.z = 1000;

var earth_cam = new THREE.PerspectiveCamera(75, aspect_ratio, 1, 1e6);

var camera = above_cam;

// This will draw what the camera sees onto the screen:
var renderer = new THREE.WebGLRenderer();
renderer.setSize(window.innerWidth, window.innerHeight);
document.body.appendChild(renderer.domElement);

// ******** START CODING ON THE NEXT LINE ********
document.body.style.backgroundColor = 'black';

var surface = new THREE.MeshPhongMaterial({ambient: 0xFFD700});
var star = new THREE.SphereGeometry(50, 28, 21);
var sun = new THREE.Mesh(star, surface);
scene.add(sun); // meshes and lights are only rendered once added to the scene

var ambient = new THREE.AmbientLight(0xffffff);
scene.add(ambient);

var sunlight = new THREE.PointLight(0xffffff, 15, 1000, 1);
sun.add(sunlight); // the point light sits at the sun's position

var surface = new THREE.MeshPhongMaterial({ambient: 0x1a1a1a, color: 0x0000cd});
var planet = new THREE.SphereGeometry(20, 120, 115);
var earth = new THREE.Mesh(planet, surface);
earth.position.set(250, 0, 0);
scene.add(earth);

var surface = new THREE.MeshPhongMaterial({ambient: 0x1a1a1a, color: 0xb22222});
var planet = new THREE.SphereGeometry(20, 120, 115);
var mars = new THREE.Mesh(planet, surface);
mars.position.set(500, 0, 0);
scene.add(mars);

var clock = new THREE.Clock();

function animate() {
requestAnimationFrame(animate);

var time = clock.getElapsedTime();

var e_angle = time * 0.8;
earth.position.set(250 * Math.cos(e_angle), 250 * Math.sin(e_angle), 0);

var m_angle = time * 0.3;
mars.position.set(500 * Math.cos(m_angle), 500 * Math.sin(m_angle), 0);

var y_diff = mars.position.y - earth.position.y,
x_diff = mars.position.x - earth.position.x,
angle = Math.atan2(x_diff, y_diff);

// camera faces the same direction as Z-axis by default
earth_cam.rotation.set(Math.PI / 2, -angle, 0);
earth_cam.position.set(earth.position.x, earth.position.y, 22);

// Now, show what the camera sees on the screen:
renderer.render(scene, camera);
}

animate();

var stars = new THREE.Geometry();
while (stars.vertices.length < 1e4) {
var lat = Math.PI * Math.random() - Math.PI / 2;
var lon = 2 * Math.PI * Math.random();

stars.vertices.push(new THREE.Vector3(
1e5 * Math.cos(lon) * Math.cos(lat),
1e5 * Math.sin(lon) * Math.cos(lat),
1e5 * Math.sin(lat)
));
}
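var star_stuff = new THREE.ParticleBasicMaterial({size: 500});
var star_system = new THREE.ParticleSystem(stars, star_stuff);
scene.add(star_system);

// Switch cameras with the keyboard: A = view from above, E = view from Earth
document.addEventListener('keydown', function (event) {
  var code = event.keyCode;

  if (code == 65) { // A
    camera = above_cam;
  }
  if (code == 69) { // E
    camera = earth_cam;
  }
});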

</script>


## JavaScript execution order looks weird

var mydata = []
d3.csv("https://raw.githubusercontent.com/kindlychung/cytob/master/data/hg19.csv", function (err, data) {
var data1 = data.filter(function (d) {
return d.chr == "chr22";
})
var innerdata = [];
for(var i = 0; i < data1.length; i++) {
mydata.push(data1[i].cyto);
innerdata.push(data1[i].cyto);
}
console.log(innerdata);
console.log("inside", mydata);
})
console.log("outside", mydata);


Result:

outside []
test.js:11 ["p13", "p12", "p11.2", "p11.1", "q11.1", "q11.21", "q11.22", "q11.23", "q12.1", "q12.2", "q12.3", "q13.1", "q13.2", "q13.31", "q13.32", "q13.33"]
test.js:12 inside ["p13", "p12", "p11.2", "p11.1", "q11.1", "q11.21", "q11.22", "q11.23", "q12.1", "q12.2", "q12.3", "q13.1", "q13.2", "q13.31", "q13.32", "q13.33"]


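It looks strange that the outside log is executed before the inside log, but this is expected: `d3.csv` is asynchronous. It returns immediately and invokes the callback only after the file has been downloaded, by which time the last line has already run with `mydata` still empty.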
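
The same ordering can be reproduced without d3 or a network request. In this sketch, `fakeCsv` is a hypothetical stand-in for an asynchronous loader (it is not part of d3) that delivers its data through a callback on a later turn of the event loop, and `log` records the order in which the two places run.

```javascript
const log = [];
const mydata = [];

// Hypothetical async loader: calls back later, like a finished download would.
function fakeCsv(url, callback) {
  setTimeout(() => callback(null, [{ chr: "chr22", cyto: "p13" }]), 0);
}

fakeCsv("hg19.csv", (err, data) => {
  data.forEach(d => mydata.push(d.cyto));
  log.push("inside");
  console.log("inside", mydata);  // runs second, data is available here
});

log.push("outside");
console.log("outside", mydata);   // runs first, mydata is still empty
```

The practical consequence is that any code depending on the loaded data has to live inside the callback (or be called from it).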

## The basics

d3.select("body").append("svg")
.attr("width", 50)
.attr("height", 50)
.append("circle")
.attr("cx", 25)
.attr("cy", 25)
.attr("r", 25)
.style("fill", "purple")

var theData = [ 1, 2, 3 ]
var p = d3.select("body").selectAll("p")
.data(theData)
.enter()
.append("p")
.text("Hello");
//text(function () {
//    return "hello world!";
//});
//.text(function (d) {
//    return d * 2
//});
//.text(function (d, i) {
//    return "Index: " + i + ", Val: " + d;
//});

console.log(p)

## Binding shape properties to data

var circleRadii = [40, 20, 10];

var svgContainer = d3.select("body").append("svg")
.attr("width", 600)
.attr("height", 100);

var circles = svgContainer.selectAll("circle")
.data(circleRadii) // bind the radii; enter() needs a data join to work on
.enter()
.append("circle");

var circleAttributes = circles
.attr("cx", 50)
.attr("cy", 50)
.attr("r", function (d) { return d; })
.style("fill", function(d) {
var returnColor;
if (d === 40) { returnColor = "green";
} else if (d === 20) { returnColor = "purple";
} else if (d === 10) { returnColor = "red"; }
return returnColor;
});

## Binding styles and coordinates to data

var spaceCircles = [30, 70, 110];

var svgContainer = d3.select("body").append("svg")
.attr("width", 200)
.attr("height", 200);

var circles = svgContainer.selectAll("circle")
.data(spaceCircles)
.enter()
.append("circle");

var circleAttributes = circles
.attr("cx", function (d) { return d; })
.attr("cy", function (d) { return d; })
.attr("r", 20 )
.style("fill", function(d) {
var returnColor;
if (d === 30) { returnColor = "green";
} else if (d === 70) { returnColor = "purple";
} else if (d === 110) { returnColor = "red"; }
return returnColor;
});

## Using json object as data

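// each object needs a radius, because the "r" attribute below reads d.radius
var jsonCircles = [
{
"x_axis": 30,
"y_axis": 30,
"radius": 20,
"color" : "green"
}, {
"x_axis": 70,
"y_axis": 70,
"radius": 20,
"color" : "purple"
}, {
"x_axis": 110,
"y_axis": 100,
"radius": 20,
"color" : "red"
}];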

var svgContainer = d3.select("body").append("svg")
.attr("width", 200)
.attr("height", 200);

var circles = svgContainer.selectAll("circle")
.data(jsonCircles)
.enter()
.append("circle");

var circleAttributes = circles
.attr("cx", function (d) { return d.x_axis; })
.attr("cy", function (d) { return d.y_axis; })
.attr("r", function (d) { return d.radius; })
.style("fill", function(d) { return d.color; });

### Polylines

//The data for our line
var lineData = [ { "x": 1,   "y": 5},  { "x": 20,  "y": 20},
{ "x": 40,  "y": 10}, { "x": 60,  "y": 40},
{ "x": 80,  "y": 5},  { "x": 100, "y": 60}];

//Accessor functions that tell d3 how to read x and y from each data point
var lineFunction = d3.svg.line()
.x(function(d) { return d.x; })
.y(function(d) { return d.y; })
.interpolate("linear");

//The SVG Container
var svgContainer = d3.select("body").append("svg")
.attr("width", 200)
.attr("height", 200);

//The line SVG Path we draw
var lineGraph = svgContainer.append("path")
.attr("d", lineFunction(lineData))
.attr("stroke", "blue")
.attr("stroke-width", 2)
.attr("fill", "none");
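
To see what ends up in the `d` attribute, here is a plain JavaScript sketch of how a path string for straight line segments can be built: move to the first point with `M`, then draw to each following point with `L`. The helper name `linearPath` is hypothetical; for linear interpolation the string produced by the line generator should have the same shape.

```javascript
// Build an SVG path string of straight segments through the given points.
function linearPath(points) {
  return points
    .map((p, i) => (i === 0 ? "M" : "L") + p.x + "," + p.y)
    .join("");
}

const lineData = [
  { x: 1, y: 5 }, { x: 20, y: 20 }, { x: 40, y: 10 },
  { x: 60, y: 40 }, { x: 80, y: 5 }, { x: 100, y: 60 }
];

const d = linearPath(lineData);
console.log(d); // M1,5L20,20L40,10L60,40L80,5L100,60
```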

## Dynamically determine size of svg

var jsonRectangles = [
{ "x_axis": 10, "y_axis": 10, "height": 20, "width":20, "color" : "green" },
{ "x_axis": 160, "y_axis": 40, "height": 20, "width":20, "color" : "purple" },
{ "x_axis": 70, "y_axis": 70, "height": 20, "width":20, "color" : "red" }];

var max_x = 0;
var max_y = 0;

for (var i = 0; i < jsonRectangles.length; i++) {
  // right and bottom edges of the current rectangle
  var temp_x = jsonRectangles[i].x_axis + jsonRectangles[i].width;
  var temp_y = jsonRectangles[i].y_axis + jsonRectangles[i].height;

  if ( temp_x >= max_x ) { max_x = temp_x; }

  if ( temp_y >= max_y ) { max_y = temp_y; }
}

var svgContainer = d3.select("body").append("svg")
.attr("width", max_x)
.attr("height", max_y)

var rectangles = svgContainer.selectAll("rect")
.data(jsonRectangles)
.enter()
.append("rect");

var rectangleAttributes = rectangles
.attr("x", function (d) { return d.x_axis; })
.attr("y", function (d) { return d.y_axis; })
.attr("height", function (d) { return d.height; })
.attr("width", function (d) { return d.width; })
.style("fill", function(d) { return d.color; });

## Linear scale

var initialScaleData = [0, 1000, 3000, 2000, 5000, 4000, 7000, 6000, 9000, 8000, 10000];

var newScaledData = [];
var minDataPoint = d3.min(initialScaleData);
var maxDataPoint = d3.max(initialScaleData);

var linearScale = d3.scale.linear()
.domain([minDataPoint,maxDataPoint])
.range([0,100]);

for (var i = 0; i < initialScaleData.length; i++) {
newScaledData[i] = linearScale(initialScaleData[i]);
}

newScaledData;
//[0, 10, 30, 20, 50, 40, 70, 60, 90, 80, 100]
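
Under the hood a linear scale is just a straight-line mapping from the domain interval to the range interval. The sketch below reimplements that mapping in plain JavaScript to make the arithmetic explicit; `makeLinearScale` is a hypothetical helper, not a d3 function, and it ignores d3 features such as clamping.

```javascript
// Map x from [d0, d1] to [r0, r1] along a straight line.
function makeLinearScale(domain, range) {
  const [d0, d1] = domain;
  const [r0, r1] = range;
  // multiply before dividing so the integer inputs used here stay exact
  return x => r0 + ((x - d0) * (r1 - r0)) / (d1 - d0);
}

const data = [0, 1000, 3000, 2000, 5000, 4000, 7000, 6000, 9000, 8000, 10000];
const scale = makeLinearScale([Math.min(...data), Math.max(...data)], [0, 100]);
const scaled = data.map(scale);
console.log(scaled); // [0, 10, 30, 20, 50, 40, 70, 60, 90, 80, 100]
```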

## Transformation

var circleData = [
{ "cx": 20, "cy": 20, "radius": 20, "color" : "green" },
{ "cx": 70, "cy": 70, "radius": 20, "color" : "purple" }];

var rectangleData = [
{ "rx": 110, "ry": 110, "height": 30, "width": 30, "color" : "blue" },
{ "rx": 160, "ry": 160, "height": 30, "width": 30, "color" : "red" }];

var svgContainer = d3.select("body").append("svg")
.attr("width",200)
.attr("height",200);

var circleGroup = svgContainer.append("g")
.attr("transform", "translate(80,0)")

var circles = circleGroup.selectAll("circle")
.data(circleData)
.enter()
.append("circle");

var circleAttributes = circles
.attr("cx", function (d) { return d.cx; })
.attr("cy", function (d) { return d.cy; })
.attr("r", function (d) { return d.radius; })
.style("fill", function (d) { return d.color; });

var rectangles = svgContainer.selectAll("rect")
.data(rectangleData)
.enter()
.append("rect");

var rectangleAttributes = rectangles
.attr("x", function (d) { return d.rx; })
.attr("y", function (d) { return d.ry; })
.attr("height", function (d) { return d.height; })
.attr("width", function (d) { return d.width; })
.style("fill", function(d) { return d.color; });

//Circle Data Set
var circleData = [
{ "cx": 20, "cy": 20, "radius": 20, "color" : "green" },
{ "cx": 70, "cy": 70, "radius": 20, "color" : "purple" }];

//Create the SVG Viewport
var svgContainer = d3.select("#svgContainer")
.attr("width",200)
.attr("height",200);

//Add the SVG Text Element to the svgContainer
var text = svgContainer.selectAll("text")
.data(circleData)
.enter()
.append("text");

var circles = svgContainer.selectAll("circle")
.data(circleData)
.enter()
.append("circle")
.attr("cx", function(d) {return d.cx})
.attr("cy", function(d) {return d.cy})
.attr("r", function(d) {return d.radius}) // without a radius the circles are invisible
.attr("fill", function(d) {return d.color});

var textLabels = text
.attr("x", function(d) { return d.cx; })
.attr("y", function(d) { return d.cy; })
.text( function (d) { return "( " + d.cx + ", " + d.cy +" )"; })
.attr("font-family", "sans-serif")
.attr("font-size", "20px")
.attr("fill", "red");

## Thinking in data joins: update, enter, exit


var width = 960,
height = 500;

var svg = d3.select("body").append("svg")
.attr("width", width)
.attr("height", height)
.append("g")
.attr("transform", "translate(32," + (height / 2) + ")");

function update(data) {

// DATA JOIN
// Join new data with old elements, if any.
var text = svg.selectAll("text")
.data(data);

// UPDATE
// Update old elements as needed.
// Initially this part is empty
text.attr("class", "update").attr("fill", "blue");

// ENTER
// Create new elements as needed.
text.enter().append("text")
.attr("class", "enter")
.attr("fill", "green")
.attr("x", function(d, i) { return i * 32; })
.attr("dy", ".35em");

// ENTER + UPDATE
// Appending to the enter selection expands the update selection to include
// entering elements; so, operations on the update selection after appending to
// the enter selection will apply to both entering and updating nodes.
text.text(function(d) { return d; });

// EXIT
// Remove old elements as needed.
text.exit().remove();
}

In the javascript console:

update([1, 2])

update([1, 2, 3, 4])

update([1, 2, 3, 4, 5, 6])
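
What happens across these calls can be modelled without the DOM. The sketch below assumes the default behaviour of joining data to elements by index: data items that find an existing element form the update selection, surplus data items form enter, and surplus elements form exit. The names `joinByIndex` and `simulateUpdate` are hypothetical and only mimic the bookkeeping, not d3 itself.

```javascript
// Split a new data array against the elements already on the page, by index.
function joinByIndex(existing, data) {
  const shared = Math.min(existing.length, data.length);
  return {
    update: data.slice(0, shared), // data matched to existing elements
    enter: data.slice(shared),     // data with no element yet
    exit: existing.slice(shared)   // elements with no data any more
  };
}

let onPage = []; // stands in for the <text> elements currently in the SVG
function simulateUpdate(data) {
  const parts = joinByIndex(onPage, data);
  onPage = data.slice(); // after enter/exit, the page mirrors the data
  return parts;
}

const first = simulateUpdate([1, 2]);        // everything enters
const second = simulateUpdate([1, 2, 3, 4]); // 2 update, 2 enter
const third = simulateUpdate([1]);           // 1 updates, 3 exit
console.log(first, second, third);
```

This is why the first call shows only green (entering) text, while later calls turn the already present items blue.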

Now with a touch of animation (transition in coordinates):

var width = 960,
height = 500;

var svg = d3.select("body").append("svg")
.attr("width", width)
.attr("height", height)
.append("g")
.attr("transform", "translate(32," + (height / 2) + ")");

function update(data) {
var text = svg.selectAll("text")
.data(data);
text.attr("class", "update").attr("fill", "blue");
text.enter().append("text")
.attr("class", "enter")
.attr("fill", "green")
.attr("x", function(d, i) { return i * 32; })
.attr("y", -50)
.transition()
.attr("y", 0)
.attr("dy", ".35em");
text.text(function(d) { return d; });
text.exit()
.attr("y", 0)
.transition()
.attr("y", -50)
.remove();
}

## Nested Selection

var tableBody = d3.select("body").append("table").append("tbody");
//var cells = tableBody.selectAll("tr").selectAll("td")
//    .style("color", function(d, i, j) {return i === j ? "green" : "lightblue"})

var matrix = [
[ 0,  1,  2,  3],
[ 4,  5,  6,  7],
[ 8,  9, 10, 11],
[12, 13, 14, 15],
];
var cells = tableBody.selectAll("tr")
.data(matrix)
.enter()
.append("tr")
.selectAll("td")
.data(function(d, i) {return d})
.enter()
.append("td")
.html(function(d) {return "<b>" + d + "</b>";});