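- A citation link does not reflect how important the cited paper is; within a paper, the weight of individual citations varies widely;
- Citation counts are not comparable across disciplines;
- A breakthrough paper usually collects few citations at first, because its research community is still small in the early stage;
- Once a paper's work has been covered in textbooks, citations to it usually stop or drop sharply.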
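Wisdom> You believe it, you understand it.
Promise> Mutual respect, mutual trust, mutual participation, mutual commitment
Virtue> Knowing death gives rise to wisdom
Way> Oppose aggression and practice thrift; ride along with things and let the mind roam; I am as I naturally am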
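In their paper, Sergei Maslov et al. describe a method for evaluating papers using PageRank. What interests me more, however, is their summary of the problems with citation analysis, listed above.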
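To compete with Wikipedia, Google launched Knol, which has been running since July 2008, six months now. Google recently announced that Knol has reached 100,000 articles. That is still far from English Wikipedia's 2.71 million, but it is no small feat for six months. Notably, the problems people had feared in Knol's operating model are starting to appear: the author-attribution and review mechanisms have not worked as expected, and the consequences look set to keep dogging Knol, namely low readership, low content quality, and a good deal of duplicated content. As Nate Anderson's analysis puts it: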
Take "Barack Obama," for instance. A search for his name brings up 809 entries; since most Knol users appear to write their own entries rather than add to others (for which no compensation is forthcoming), the proliferation of entries is inevitable. And it's not at all clear that the best ones are rising to the top.
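This is a patent application filed by IBM (US 2007/0282849 A1) concerning trim(), the technique of removing leading and trailing whitespace characters. Most programmers presumably know trim() or strip(), and precisely because everyone understands this "technique", the application makes an extremely valuable "patent application template": it shows how a technique that can be explained in a few sentences gets written up as a 14-page patent application with 20 claims.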
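For readers who have not met it, here is a minimal sketch of what a trim operation does, in Python. The `trim` function is a hypothetical hand-written helper for illustration only; it is not taken from the patent application, and in practice the built-in `str.strip()` does the same job.

```python
def trim(s: str) -> str:
    """Return s without leading and trailing whitespace (illustrative helper)."""
    start, end = 0, len(s)
    # Skip whitespace from the front.
    while start < end and s[start].isspace():
        start += 1
    # Skip whitespace from the back.
    while end > start and s[end - 1].isspace():
        end -= 1
    return s[start:end]


text = " \t hello world \n"
trimmed = trim(text)
builtin = text.strip()  # the built-in equivalent
```

Both `trimmed` and `builtin` keep the inner space between the two words and drop only the whitespace at the two ends.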
Compressed Sensing
Wikipedia defines Compressed Sensing as follows:
Compressed sensing is a technique for acquiring and reconstructing a signal utilizing the prior knowledge that it is sparse or compressible.
The main idea behind compressed sensing is to exploit that there is some structure and redundancy in most interesting signals -- they are not pure noise. In particular, most signals are sparse, that is, they contain many coefficients close to or equal to zero, when represented in some domain.
Something new to learn! If you are interested, start with the introduction on Terence Tao's blog.
The thing is that while the space of all images has 2MB worth of “degrees of freedom” or “entropy”, the space of all interesting images is much smaller, and can be stored using much less space, especially if one is willing to throw away some of the quality of the image.
What if the camera selects a completely different set of 100,000 (or 300,000) wavelets, and thus loses all the interesting information in the image?
The solution to this problem is both simple and unintuitive. It is to make 300,000 measurements which are totally unrelated to the wavelet basis - despite all that I have said above regarding how this is the best basis in which to view and compress images. In fact, the best types of measurements to make are (pseudo-)random measurements - generating, say, 300,000 random “mask” images and measuring the extent to which the actual image resembles each of the masks. Now, these measurements (or “correlations”) between the image and the masks are likely to be all very small, and very random. But - and this is the key point - each one of the 2 million possible wavelets which comprise the image will generate their own distinctive “signature” inside these random measurements, as they will correlate positively against some of the masks, negatively against others, and be uncorrelated with yet more masks.
But (with overwhelming probability) each of the 2 million signatures will be distinct; furthermore, it turns out that arbitrary linear combinations of up to a hundred thousand of these signatures will still be distinct from each other (from a linear algebra perspective, this is because two randomly chosen 100,000-dimensional subspaces of a 300,000 dimensional ambient space will be almost certainly disjoint from each other). Because of this, it is possible in principle to recover the image (or at least the 100,000 most important components of the image) from these 300,000 random measurements. In short, we are constructing a linear algebra analogue of a hash function.
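The idea in the passages above can be tried at toy scale. The sketch below is only an illustration under stated assumptions: the sizes (200 coefficients, 100 random measurements, 5 nonzero components) are far smaller than the 2 million / 300,000 / 100,000 in the quoted text, the random "masks" are Gaussian rows, and the recovery step uses orthogonal matching pursuit, a simple greedy method chosen here for brevity rather than anything named in the quoted text. The function name `omp` and all variable names are made up for this example.

```python
import numpy as np


def omp(A, y, k):
    """Orthogonal matching pursuit: greedily pick k columns of A to explain y."""
    n = A.shape[1]
    residual = y.copy()
    chosen = []
    coef = np.zeros(0)
    for _ in range(k):
        # Pick the column whose "signature" correlates most strongly
        # with the part of y that is still unexplained.
        j = int(np.argmax(np.abs(A.T @ residual)))
        chosen.append(j)
        # Least-squares fit on the chosen columns, then update the residual.
        coef, *_ = np.linalg.lstsq(A[:, chosen], y, rcond=None)
        residual = y - A[:, chosen] @ coef
    x_hat = np.zeros(n)
    x_hat[chosen] = coef
    return x_hat


rng = np.random.default_rng(0)
n, m, k = 200, 100, 5  # toy sizes: coefficients, measurements, nonzeros

# A sparse signal: only k of its n coefficients are nonzero.
x = np.zeros(n)
support = rng.choice(n, size=k, replace=False)
x[support] = rng.uniform(1.0, 2.0, size=k) * rng.choice([-1.0, 1.0], size=k)

# Each row of A plays the role of one random "mask"; y holds the m correlations.
A = rng.standard_normal((m, n)) / np.sqrt(m)
y = A @ x

x_hat = omp(A, y, k)
error = float(np.max(np.abs(x_hat - x)))
print("largest coefficient error:", error)
```

Each column of `A` is the "signature" that one coefficient leaves in the random measurements. Because the signatures of a few columns are almost surely linearly independent, the sparse signal can be recovered from far fewer measurements than coefficients.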
There is also a Compressed Sensing lecture by Tao here, in seven parts.