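For my graduation project I did a brief study of clustering and classification problems. Below is a summary of the datasets I used, for anyone who needs a reference.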

Clustering datasets

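| No. | Dataset | Records | Features | Classes | Class distribution | Overlap | Source |
|---|---|---|---|---|---|---|---|
| 1 | iris | 150 | 4 | 3 | 50/50/50 | No | UCI |
| 2 | wine | 178 | 13 | 3 | 59/71/48 | No | UCI |
| 3 | emotions(music) | 593 | 72 | 6 | 173/166/264/148/168/189 | Yes | sourceforge |
| 4 | yeast | 2417 | 103 | 14 | mixed | Yes | sourceforge |
| 5 | scene | 2407 | 294 | 6 | 427/364/397/433/533/431 | Yes | sourceforge |
| 6 | wdbc | 569 | 30 | 2 | 212/357 | No | UCI |
| 7 | breasttissue | 106 | 9 | 6 | 21/15/18/16/14/22 | No | UCI |
| 8 | seeds | 210 | 7 | 3 | 70/70/70 | No | UCI |
| 9 | glass | 214 | 9 | 6(7) | 70/76/17/13/9/29 | No | UCI |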
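The overlap column can be cross-checked against the class distribution column. The sketch below assumes that "overlap" means a record may belong to more than one class, in which case the per-class counts add up to more than the number of records. The function names and the `CLUSTERING_DATASETS` dictionary are illustrative only; the numbers are copied from the table above.

```python
# Rows copied from the clustering table: name -> (records, per-class counts).
CLUSTERING_DATASETS = {
    "iris": (150, [50, 50, 50]),
    "emotions": (593, [173, 166, 264, 148, 168, 189]),
    "scene": (2407, [427, 364, 397, 433, 533, 431]),
    "wdbc": (569, [212, 357]),
    "glass": (214, [70, 76, 17, 13, 9, 29]),
}


def has_label_overlap(n_records, class_counts):
    """Return True when the per-class counts sum to more than the record count.

    Assumption: a surplus means some records carry more than one class label.
    """
    return sum(class_counts) > n_records


def overlap_report(datasets):
    """Map each dataset name to its overlap flag."""
    return {
        name: has_label_overlap(n_records, counts)
        for name, (n_records, counts) in datasets.items()
    }


if __name__ == "__main__":
    for name, flag in overlap_report(CLUSTERING_DATASETS).items():
        print(f"{name}: {'Yes' if flag else 'No'}")
```

For the rows included here, the flags agree with the overlap column: only emotions and scene have counts that exceed their record totals.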

Classification datasets

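| No. | Dataset | Records | Features | Classes | Class distribution | Missing values | Source |
|---|---|---|---|---|---|---|---|
| 1 | appendicitis | 106 | 7 | 2 | 21/85 | No | KEEL |
| 2 | balance | 625 | 4 | 3 | 288/49/288 | No | KEEL,UCI |
| 3 | banana | 5300 | 2 | 2 | 2924/2376 | No | KEEL |
| 4 | bands | 365(539) | 19 | 2 | 230/135 | Yes | KEEL,UCI |
| 5 | bupa | 345 | 6 | 2 | 145/200 | No | KEEL,UCI |
| 6 | cleveland | 297(303) | 13 | 5 | 160/54/35/35/13 | Yes | KEEL,UCI |
| 7 | dermatology | 358(366) | 34 | 6 | 111/60/71/48/48/20 | Yes | KEEL,UCI |
| 8 | haberman | 306 | 3 | 2 | 225/81 | No | KEEL,UCI |
| 9 | hayes-roth | 160 | 4 | 3 | 65/64/31 | No | KEEL,UCI |
| 10 | heart | 270 | 13 | 2 | 150/120 | No | KEEL,UCI |
| 11 | hepatitis | 80(155) | 19 | 2 | 13/67 | Yes | KEEL,UCI |
| 12 | ionosphere | 351 | 34 | 2 | 225/126 | No | KEEL,UCI |
| 13 | iris | 150 | 4 | 3 | 50/50/50 | No | KEEL,UCI |
| 14 | led7digit | 500 | 7 | 10 | 45/37/51/57/52/52/47/57/53/49 | No | KEEL,UCI |
| 15 | mammographic | 830(961) | 5 | 2 | 427/403 | No | KEEL,UCI |
| 16 | marketing | 6876(8993) | 13 | 9 | 1255/529/505/618/527/846/784/1069/743 | Yes | KEEL,biolab |
| 17 | monks2 | 432 | 7 | 2 | 290/142 | No | KEEL,UCI |
| 18 | movement_libras | 360 | 90 | 15 | 24/…/24 | No | KEEL,UCI |
| 19 | newthyroid | 215 | 5 | 3 | 150/35/30 | No | KEEL,UCI |
| 20 | pageblocks | 5473 | 10 | 5 | 4913/329/28/88/115 | No | KEEL,UCI |
| 21 | penbased | 10092 | 16 | 10 | | No | KEEL,UCI |
| 22 | phoneme | 5404 | 5 | 2 | 3818/1586 | No | KEEL,UCL |
| 23 | pima | 768 | 8 | 2 | 500/268 | No | KEEL,UCI |
| 24 | ring | 7400 | 20 | 2 | 3664/3736 | No | KEEL,UTO |
| 25 | satimage | 6435 | 36 | 7 | 1533/703/1358/626/707/0/1508 | No | KEEL,UCI |
| 26 | segment | 2310 | 19 | 7 | 330/…/330 | No | KEEL,UCI |
| 27 | sonar | 208 | 60 | 2 | 97/111 | No | KEEL,UCI |
| 28 | spambase | 4597(4601) | 57 | 2 | 2788/1813 | Yes | KEEL,UCI |
| 29 | spectfheart | 267 | 44 | 2 | 55/212 | No | KEEL,UCI |
| 30 | tae | 151 | 5 | 3 | 49/50/52 | No | KEEL,UCI |
| 31 | texture | 5500 | 40 | 11 | 500/…/500 | No | KEEL,UCL |
| 32 | thyroid | 7200 | 21 | 3 | 166/368/6666 | No | KEEL,UCI |
| 33 | titanic | 2201 | 3 | 2 | 1490/711 | No | KEEL,TOR |
| 34 | twonorm | 7400 | 20 | 2 | 3703/3697 | No | KEEL,UTO |
| 35 | vehicle | 846 | 18 | 4 | 212/218/199/217 | No | KEEL,UCI |
| 36 | vowel | 990 | 13 | 11 | 90/…/90 | No | KEEL,UCI |
| 37 | wdbc | 569 | 30 | 2 | 212/357 | No | UCI |
| 38 | wine | 178 | 13 | 3 | 59/71/48 | No | UCI |
| 39 | winequality-red | 1599 | 11 | 11 | 10/53/681/638/199/18 | No | KEEL,UCI |
| 40 | wisconsin | 683(699) | 9 | 2 | 444/239 | No | KEEL,UCI |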
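The record, feature, and class distribution figures for a KEEL-style file can be recomputed with a few lines of Python. The sketch below makes several assumptions that the post itself does not state: header lines start with `@` and the data follows an `@data` line, fields are comma-separated with the class label last, and missing values are written as `?` or `<null>`. It reports both the total number of rows and the number of complete rows, on the further assumption that an entry such as 365(539) means complete records followed by total records in parentheses (in most of those rows the class counts add up to the first number). `SAMPLE`, `parse_keel_text`, and `summarize` are hypothetical names used only for illustration.

```python
from collections import Counter

# Assumption: these strings mark a missing value in the data section.
MISSING_MARKERS = {"?", "<null>"}

# A tiny made-up file in the assumed KEEL-style layout (not a real dataset).
SAMPLE = """@relation toy
@attribute a real [0.0, 1.0]
@attribute b real [0.0, 1.0]
@attribute class {0, 1}
@inputs a, b
@outputs class
@data
0.1, 0.2, 0
0.3, <null>, 1
0.5, 0.6, 1
0.7, 0.8, 0
"""


def parse_keel_text(text):
    """Return the data rows as lists of strings, skipping '@' header lines."""
    rows = []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("@"):
            if line.lower().startswith("@data"):
                in_data = True
            continue
        if in_data:
            rows.append([field.strip() for field in line.split(",")])
    return rows


def summarize(rows):
    """Count total rows, complete rows, features, and classes among complete rows."""
    complete = [r for r in rows if not any(f in MISSING_MARKERS for f in r)]
    distribution = Counter(r[-1] for r in complete)
    return {
        "total": len(rows),
        "complete": len(complete),
        "n_features": len(rows[0]) - 1 if rows else 0,
        "distribution": dict(distribution),
    }


if __name__ == "__main__":
    print(summarize(parse_keel_text(SAMPLE)))
```

To apply this to a downloaded file, read its contents with `open(path).read()` and pass the text to `parse_keel_text`; check the file's actual header and missing-value conventions first, since they may differ from the assumptions above.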