Posts

2021

How to disable auto strip in CharField in Django

19 December 2021·2 mins

In Django, when you edit a field on an admin page or post data to a form, leading and trailing whitespace in CharField and TextField values is removed. The reason is the strip=True parameter of forms.CharField, which was added in Django 1.9. You can see the discussion in Django ticket #4960, and here is the source code.

Using JSONField before Django 3.1

11 September 2021·2 mins

Django 3.1 added support for saving Python data to the database as JSON-encoded data, and it is also possible to query based on field values in a JSONField. The detailed usage can be found here. If you are using an older version and want to try this feature.

Dynamically Allocate Executors when Executing Jobs in Spark

18 July 2021·4 mins

I wrote a Spark program to process logs. The number of logs changes over time. To ensure logs can be processed promptly, the number of executors is calculated from the maximum number of logs per minute. As a consequence, CPU usage on the executors is low.

Improve Kafka throughput

28 May 2021·5 mins

Kafka is a high-performance and scalable messaging system. Sometimes, when handling big data, the default configuration may limit the maximum performance. In this article, I’ll explain how messages are generated and saved in Kafka, and how to improve performance by changing the configuration.

Fix Error: Cask 'java' is unavailable in Homebrew

7 March 2021·1 min

After updating brew to the latest version, calling any cask-related command, such as brew list --cask, always outputs Error: Cask 'java' is unavailable: No Cask with this name exists. However, the brew command itself works. After doing some research, I found that Java has been moved to homebrew/core.

2020

Timezone in JVM

18 October 2020·2 mins

I wrote some Scala code to get the current time. However, the output differs between the development server and Docker: import java.util.Calendar; println(Calendar.getInstance().getTime). On my development server, it outputs Sun Oct 18 18:01:01 CST 2020, but in Docker, it prints a UTC time.

Using cibuildwheel to Create Python Wheels

29 July 2020·2 mins

Have you ever tried to install MySQL-python? It contains C code, which needs to be compiled while installing the package. You have to follow the steps in this article: Install MySQL and MySQLClient(Python) in MacOS. Things get worse if you are using Windows.

Retrieve Large Dataset in Elasticsearch

21 June 2020·5 mins

It’s easy to get a small dataset from Elasticsearch by using size and from. However, it’s impossible to retrieve a large dataset in the same way. Deep Paging Problem: As we know, Elasticsearch data is organised into indexes, each of which is a logical namespace, and the real data is stored in physical shards.
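
To make the deep paging problem concrete, here is a rough sketch in plain Python. It does not talk to Elasticsearch; it only estimates how many hits have to be collected for one from/size page, assuming every shard must return its top from + size hits to the coordinating node. The function name deep_paging_cost, the shard count of 5, and the page sizes are illustrative assumptions. The 10,000 limit is the default value of index.max_result_window.

```python
# Default value of the index.max_result_window setting in Elasticsearch.
DEFAULT_MAX_RESULT_WINDOW = 10_000


def deep_paging_cost(page_from: int, size: int, shards: int) -> dict:
    """Estimate the work behind one from/size page (hypothetical helper)."""
    # Each shard has to produce its own top (from + size) hits.
    per_shard = page_from + size
    return {
        "per_shard": per_shard,
        # The coordinating node merges the candidates from all shards.
        "coordinator": per_shard * shards,
        # Only `size` hits are finally returned to the client.
        "returned": size,
        # from + size above the window is rejected by default.
        "within_window": per_shard <= DEFAULT_MAX_RESULT_WINDOW,
    }


if __name__ == "__main__":
    # Assumed example: an index with 5 shards, 10 hits per page.
    for page_from in (0, 1_000, 10_000):
        print(page_from, deep_paging_cost(page_from, size=10, shards=5))
```

The deeper the page, the more hits are collected and thrown away just to return the same 10 results.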

Program Crash Caused by CPU Instruction

17 May 2020·3 mins

It’s inevitable to deal with bugs in a coding career. The main parts of coding are implementing new features, fixing bugs and improving performance. For me, there are two kinds of bugs that are difficult to tackle: those that are hard to reproduce, and those that occur in code not written by you.

C-m, RET and Return Key in Emacs

11 April 2020·2 mins

I use Emacs to write my blog. After a recent update, I found that M-RET no longer behaves as the leader key in org mode, but as org-meta-return. Even stranger, in other modes it still behaves as the leader key. And M-RET also works in the terminal in org mode.