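Everyone now takes Python for scientific computing for granted. The younger students don't remember that, not so many years ago, it was just a niche language.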

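Thanks to a recommendation from Prof. Hua Zhou at UCLA, we recently started using Julia in our algorithm research. Its performance is striking: close to C, yet as easy to write as Python. Julia can also use SIMD or AVX instructions directly to vectorize and speed up code. The last two images compare the native code (assembly): the SIMD version uses four registers, XMM0-3, while the plain version uses only one, XMM0.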
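One way to picture why a vectorized loop occupies several registers is a reduction that keeps several independent partial sums. The sketch below is a conceptual model in plain Python, not Julia and not real SIMD. It assumes the compared loop is a sum reduction, which the post does not state, and the function names are made up for illustration.

```python
def sum_sequential(values):
    """Add values strictly left to right with a single accumulator."""
    total = 0.0
    for v in values:
        total += v
    return total


def sum_lanes(values, lanes=4):
    """Add values with `lanes` independent accumulators, then combine them.

    This mimics how a vectorized reduction keeps several partial sums in
    separate registers. It models the idea only; no SIMD instruction runs.
    """
    partial = [0.0] * lanes
    for i, v in enumerate(values):
        partial[i % lanes] += v
    return sum_sequential(partial)


if __name__ == "__main__":
    data = [float(i) for i in range(1, 101)]
    print(sum_sequential(data), sum_lanes(data))  # 5050.0 5050.0

    # Reordering floating-point additions can change the rounded result.
    tricky = [1e16, 1.0, -1e16, 1.0]
    print(sum_sequential(tricky))      # 1.0
    print(sum_lanes(tricky, lanes=2))  # 2.0
```

The second example shows the trade-off: splitting a sum across accumulators changes the order of the floating-point additions, so the rounded result can differ from the strictly sequential loop.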

Does anyone know which three languages the name Jupyter (as in Jupyter notebook) is short for?

We recently released Pegaflow, our internal HPC (high-performance computing) workflow package, to the public. It is a large-scale distributed computing library for supercomputers. It lets a developer connect dependent computing jobs into a DAG (directed acyclic graph) and run that DAG in parallel on a computing cluster rather than on a single node.

Pegaflow differs from Pegasus's own Python 2 API in three ways:

  1. Pegaflow is Python 3 only.
  2. It provides an abstract class that significantly simplifies workflow coding for OO (object-oriented) programmers.
  3. It provides APIs that simplify workflow coding for non-OO programmers.
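To make the DAG idea concrete, here is a minimal sketch in plain Python. It is not Pegaflow's or Pegasus's API. It uses only the standard library (graphlib and concurrent.futures, Python 3.9+) and runs jobs as threads on one machine instead of a cluster. The run_dag helper and the job names (align_a, align_b, merge, report) are hypothetical.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter


def run_dag(dependencies, actions, max_workers=4):
    """Run each job after all of its parents; independent jobs run in parallel.

    dependencies: {job_name: set of parent job names}
    actions: {job_name: zero-argument callable}
    Returns the job names in the order they finished.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    finished = []
    running = {}  # future -> job name
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Submit every job whose parents have all completed.
            for job in sorter.get_ready():
                running[pool.submit(actions[job])] = job
            done, _ = wait(running, return_when=FIRST_COMPLETED)
            for future in done:
                job = running.pop(future)
                future.result()   # re-raise any exception from the job
                sorter.done(job)  # unlock the children of this job
                finished.append(job)
    return finished


if __name__ == "__main__":
    # Hypothetical workflow: two independent jobs, then a merge, then a report.
    deps = {
        "align_a": set(),
        "align_b": set(),
        "merge": {"align_a", "align_b"},
        "report": {"merge"},
    }
    acts = {name: (lambda n=name: print("running", n)) for name in deps}
    print(run_dag(deps, acts))
```

Here align_a and align_b have no parents, so they can run at the same time; merge waits for both, and report waits for merge. A workflow system for clusters applies the same ordering rule but dispatches each job to a cluster scheduler.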

The pip package is ready to install. Install via "pip3 install pegaflow".

Code is at

For examples, check


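It is time for the New Year party again. Thank you all for your hard work over the past year. The lab has made solid progress in algorithm and model development, database construction, and other areas. We look forward to discovering new patterns and new models in 2020.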

PS: A certain student's poker probability skills still need some work :)

Professor Xifeng Yan, a good friend and long-term collaborator since my Ph.D. years, was invited to give a talk: Deep Learning in Motif Finding and Time Series Forecasting. He is a tenured professor of Computer Science at the University of California, Santa Barbara (UCSB), and chaired the 2019 SIAM International Conference on Data Mining.

DeepMotif, developed by his group, is 10,000 times faster than the classic algorithm MEME. AGA (Attention-Guided Autoregression), his group's time-series forecasting algorithm, currently achieves the best prediction accuracy, beating Amazon's DeepAR and Google's Transformer. Thanks to Prof. Zheng, Prof. Yin, and all the students who found time in their busy schedules to attend the lecture. We look forward to more exchanges in the future.

Xifeng Yan, Venkatesh Narayanamurti Chair of Computer Science, University of California at Santa Barbara

Location: 4-426 at Haike 501 (conference room on the east side)

Time: 2:30pm, Thursday, Oct 17th, 2019


Sequences and time series data are ubiquitous. In this talk, I will introduce the recent advances we have made in analyzing sequential data, specifically in motif finding and time series forecasting. Traditional statistical models often have to sacrifice speed for accuracy in order to handle large volumes of sequential data generated by new devices. I will discuss how to leverage deep learning techniques to solve this issue and make new breakthroughs in these two areas.


Xifeng Yan is a professor at the University of California at Santa Barbara, holding the Venkatesh Narayanamurti Chair of Computer Science. He received his Ph.D. degree in Computer Science from the University of Illinois at Urbana-Champaign in 2006 and was a research staff member at the IBM T. J. Watson Research Center between 2006 and 2008. His work centers on modeling, mining, and searching data, with contributions to data mining, database systems, natural language processing, and related areas. His work has been extensively cited, with over 18,000 citations per Google Scholar and thousands of software downloads. He received the NSF CAREER Award, the IBM Invention Achievement Award, the ACM-SIGMOD Dissertation Runner-Up Award, and the IEEE ICDM 10-year Highest Impact Paper Award.