Different types of Attention
\(s_t\) is the target hidden state and \(h_i\) are the source hidden states; each is a column vector of shape (n,1). \(c_t\) is the final context vector, and \(\alpha_{t,i}\) is the alignment score of source position \(i\) at target step \(t\).
\[\begin{aligned} c_t&=\sum_{i=1}^n \alpha_{t,i}h_i \\ \alpha_{t,i}&= \frac{\exp(\mathrm{score}(s_t,h_i))}{\sum_{j=1}^n \exp(\mathrm{score}(s_t,h_j))} \end{aligned}\]
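The two equations above can be sketched in NumPy. The score function here is the dot product, which is only one of several choices (an assumption for illustration); the softmax over scores gives the alignment weights, and the context vector is the weighted sum of source states.

```python
import numpy as np

def attention_context(s_t, H):
    """One step of attention.

    s_t : (d,) target hidden state
    H   : (n, d) matrix whose rows are the n source hidden states
    Uses the dot-product score, one of several possible score functions.
    Returns the alignment weights alpha (n,) and context vector c_t (d,).
    """
    scores = H @ s_t                                # score(s_t, h_i) for each i
    scores -= scores.max()                          # shift for softmax stability
    alpha = np.exp(scores) / np.exp(scores).sum()   # softmax -> alignment weights
    c_t = alpha @ H                                 # weighted sum of source states
    return alpha, c_t
```

The weights always sum to 1, so `c_t` is a convex combination of the source hidden states.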
Global (Soft) vs Local (Hard) #
Global attention takes all source hidden states into account, while local attention only attends to a subset of the source hidden states, typically a window around a chosen source position.
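A minimal sketch of the local variant, assuming a fixed window of radius `D` centered at a given source position `p_t` (the predictive version of local attention learns `p_t` and adds a Gaussian falloff; that is omitted here):

```python
import numpy as np

def local_attention_context(s_t, H, p_t, D=2):
    """Local attention over the window [p_t - D, p_t + D].

    s_t : (d,) target hidden state
    H   : (n, d) matrix whose rows are the n source hidden states
    p_t : center of the attention window (assumed given here)
    Uses the dot-product score inside the window; states outside
    the window get zero weight.
    """
    n = H.shape[0]
    lo, hi = max(0, p_t - D), min(n, p_t + D + 1)
    scores = H[lo:hi] @ s_t                 # score only within the window
    scores -= scores.max()                  # softmax stability
    alpha_win = np.exp(scores) / np.exp(scores).sum()
    alpha = np.zeros(n)                     # weights outside the window are 0
    alpha[lo:hi] = alpha_win
    c_t = alpha @ H                         # context from the window only
    return alpha, c_t
```

Compared with global attention, only `2D + 1` score evaluations are needed per target step, which is the main computational appeal for long source sequences.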