Fixing distinct() Failing to Remove Duplicate Rows in Django -- 688IT编程网

How to use distinct to query multiple non-duplicate records in MySQL

Sometimes you want to use distinct() to remove duplicates from a QuerySet. The Django documentation gives these examples:

>>> Author.objects.distinct()
[...]
>>> Entry.objects.order_by('pub_date').distinct('pub_date')
[...]
>>> Entry.objects.order_by('blog').distinct('blog')
[...]
>>> Entry.objects.order_by('author', 'pub_date').distinct('author', 'pub_date')
[...]
>>> Entry.objects.order_by('blog__name', 'mod_date').distinct('blog__name', 'mod_date')
[...]
>>> Entry.objects.order_by('author', 'pub_date').distinct('author')
[...]

Note

The Django documentation stresses that the fields passed to distinct() must also be given to order_by(), and must come first there.

When you specify field names, you must provide an order_by() in the QuerySet, and the fields in order_by() must start with the fields in distinct(), in the same order.

For example, SELECT DISTINCT ON (a) gives you the first row for each value in column a. If you don't specify an order, you'll get some arbitrary row.

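I followed this exactly, but with a MySQL database I ended up with this error:

raise NotImplementedError('DISTINCT ON fields is not supported by this database backend')
NotImplementedError: DISTINCT ON fields is not supported by this database backend

In other words, the MySQL backend does not support DISTINCT ON.

Of course, you can deduplicate in Python instead: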

items = []
for item in query_set:
    if item not in items:
        items.append(item)
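One caveat with the loop above: `item not in items` rescans the list on every iteration, and it compares whole items. Django model instances compare equal when their primary keys match, so rows that differ only in id are never treated as duplicates. A minimal sketch of an order-preserving variant that deduplicates on a chosen key follows. The helper name `unique_by` and the sample rows are hypothetical, and plain dicts stand in for the rows of a values() query.

```python
def unique_by(rows, key):
    """Return rows with duplicates removed, keeping the first row for each key value."""
    seen = set()
    result = []
    for row in rows:
        k = key(row)
        if k not in seen:  # set membership test instead of scanning a list
            seen.add(k)
            result.append(row)
    return result


# Plain dicts stand in for the rows of a values() query (sample data, not from the article).
rows = [
    {"id": 1, "finish_time": "2020-01-01"},
    {"id": 2, "finish_time": "2020-01-01"},
    {"id": 3, "finish_time": "2020-02-02"},
]
unique_rows = unique_by(rows, key=lambda r: r["finish_time"])
print([r["id"] for r in unique_rows])
```

This pulls every row into Python first, so it is only reasonable for small result sets; deduplicating in the database is usually preferable.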

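First, note that there are two ways to run a model query in Django.

The first is to use the query API that Django provides, such as filter(), values(), distinct(), and order_by().

Code: LOrder.objects.values('finish_time').distinct()

Note what the official documentation says here:

Examples (those after the first will only work on PostgreSQL):

>>> Author.objects.distinct()
[...]
>>> Entry.objects.order_by('pub_date').distinct('pub_date')
[...]
>>> Entry.objects.order_by('blog').distinct('blog')
[...]
>>> Entry.objects.order_by('author', 'pub_date').distinct('author', 'pub_date')
[...]
>>> Entry.objects.order_by('blog__name', 'mod_date').distinct('blog__name', 'mod_date')
[...]
>>> Entry.objects.order_by('author', 'pub_date').distinct('author')
[...]

Because I am using MySQL, only the first form of distinct() (with no field arguments) is available. Alternatively, it can be combined with order_by() like this: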

LOrder.objects.values('finish_time').distinct().order_by('finish_time')
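A note on why order_by() matters here: the Django documentation points out that any fields used in order_by() are included in the SQL SELECT columns, so an ordering on some other column (for example a default ordering in the model's Meta) can make a distinct() query return rows that look duplicated. Calling order_by('finish_time') replaces that ordering. The sketch below reproduces the effect at the SQL level with Python's built-in sqlite3 module rather than MySQL. The table name follows the article's raw query, and the rows are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE keywork_lorder (id INTEGER PRIMARY KEY, finish_time TEXT)")
conn.executemany(
    "INSERT INTO keywork_lorder (finish_time) VALUES (?)",
    [("2020-01-01",), ("2020-01-01",), ("2020-02-02",)],
)

# DISTINCT over finish_time alone collapses the repeated value.
only_time = conn.execute(
    "SELECT DISTINCT finish_time FROM keywork_lorder ORDER BY finish_time"
).fetchall()

# Once another column (here id) is pulled into the SELECT list, every row is
# distinct again, which is what an extra ordering field does to distinct().
with_id = conn.execute(
    "SELECT DISTINCT finish_time, id FROM keywork_lorder ORDER BY id"
).fetchall()

print(only_time)
print(with_id)
```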

The second is to use a raw SQL query:

LOrder.objects.raw('SELECT DISTINCT id,finish_time FROM keywork_lorder group by finish_time')

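The statement above deduplicates directly in MySQL. Two points need particular attention here.

First, there is exactly one field that a raw SQL query cannot leave out. The official documentation says:

There is only one field that you can't leave out: the primary key field. Django uses the primary key to identify model instances, so it must always be included in a raw query. An InvalidQuery exception will be raised if you forget to include the primary key.

That is, if your SQL statement is 'SELECT DISTINCT finish_time FROM keywork_lorder', it fails with "Raw query must include the primary key". The id field cannot be dropped!

Second, this is a raw MySQL statement, and to remove the duplicates in MySQL it has to be written like this: 'SELECT DISTINCT id,finish_time FROM keywork_lorder group by finish_time'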
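The GROUP BY statement above selects id without aggregating it, which MySQL rejects when the ONLY_FULL_GROUP_BY SQL mode is enabled (part of the default mode since MySQL 5.7.5). A more portable way to keep one primary key per finish_time is to aggregate it with MIN(id). A minimal sketch, run against an in-memory SQLite database through Python's sqlite3 module instead of MySQL, with made-up sample rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE keywork_lorder (id INTEGER PRIMARY KEY, finish_time TEXT)")
conn.executemany(
    "INSERT INTO keywork_lorder (finish_time) VALUES (?)",
    [("2020-01-01",), ("2020-01-01",), ("2020-02-02",), ("2020-02-02",)],
)

# One row per finish_time; MIN(id) picks a well-defined primary key for each group.
sql = (
    "SELECT MIN(id) AS id, finish_time "
    "FROM keywork_lorder GROUP BY finish_time ORDER BY finish_time"
)
dedup_rows = conn.execute(sql).fetchall()
print(dedup_rows)
```

Because the result still contains a column named id, the same SQL string should also be usable with LOrder.objects.raw().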

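Addendum: removing duplicate data with DISTINCT

DISTINCT returns the unique values of the selected columns in a query (that is, it removes duplicates). It works on a single column or on several columns.

In practice it is common for a column to contain repeated values, such as the dept column of an employee table.

To get all the different values of a column when querying, use DISTINCT.

DISTINCT syntax

select [distinct] column_name1, column_name2
from table_name;

Let's try it out.

Create a footprint table

create table footprint(
    id int not null auto_increment primary key,
    username varchar(30) comment 'user name',
    city varchar(30) comment 'city',
    visit_date varchar(10) comment 'visit date'
);

Insert some data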

insert into footprint(username, city, visit_date) values('mofei', '贵阳', '2019-12-05');

insert into footprint(username, city, visit_date) values('mofei', '贵阳', '2020-01-15');

insert into footprint(username, city, visit_date) values('mofei', '北京', '2018-10-10');

insert into footprint(username, city, visit_date) values('zhangsan', '上海', '2020-01-01');

insert into footprint(username, city, visit_date) values('zhangsan', '上海', '2020-02-02');

insert into footprint(username, city, visit_date) values('lisi', '拉萨', '2016-12-20');

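Which cities have these users visited?

mysql> select distinct city from footprint;

This gives the same result as GROUP BY, except that DISTINCT exists specifically for removing duplicates.

mysql> select city from footprint group by city;

Find out which users are using the system

mysql> select distinct username from footprint;

When DISTINCT is applied to two columns, rows count as duplicates only when both columns match, and one row is kept for each combination.

The addendum above was contributed by 墨菲墨菲.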
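The two-column case mentioned in the addendum can be checked with a short sketch. It loads the footprint rows from the article into an in-memory SQLite database via Python's sqlite3 module (an assumption made for a self-contained example; the article itself uses the mysql client), then selects the distinct (username, city) pairs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE footprint ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "username VARCHAR(30), city VARCHAR(30), visit_date VARCHAR(10))"
)
conn.executemany(
    "INSERT INTO footprint (username, city, visit_date) VALUES (?, ?, ?)",
    [
        ("mofei", "贵阳", "2019-12-05"),
        ("mofei", "贵阳", "2020-01-15"),
        ("mofei", "北京", "2018-10-10"),
        ("zhangsan", "上海", "2020-01-01"),
        ("zhangsan", "上海", "2020-02-02"),
        ("lisi", "拉萨", "2016-12-20"),
    ],
)

# Rows collapse only when BOTH username and city are equal.
pairs = conn.execute(
    "SELECT DISTINCT username, city FROM footprint ORDER BY username, city"
).fetchall()
print(len(pairs))
for username, city in pairs:
    print(username, city)
```

Six rows go in and four pairs come out: the two repeated visits (mofei to 贵阳, zhangsan to 上海) each collapse into one row, while mofei's visit to 北京 stays separate because the city differs.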

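Further notes: removing records with duplicate fields using DISTINCT and GROUP BY

Duplicate records come in two kinds. The first is fully duplicated records, where every field is identical.

The second is records in which only some key fields are duplicated, for example the Name field, while the other fields may or may not match, or their duplication can be ignored.

1. The first kind of duplication is easy to handle. Use

select distinct * from tableName

to get a result set with no duplicate records.

If the duplicates need to be deleted from the table itself (keeping one copy of each record), this can be done as follows (SQL Server syntax):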

select distinct * into #Tmp from tableName

drop table tableName

select * into tableName from #Tmp

drop table #Tmp

This kind of duplication is the result of careless table design. Adding a unique index column solves it.
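The temp-table statements above use SQL Server syntax (SELECT ... INTO #Tmp). The same rebuild can be sketched for other databases with CREATE TABLE ... AS SELECT. The version below uses SQLite through Python's sqlite3 module, with a hypothetical `people` table and made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, address TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("Ann", "Oak St"), ("Ann", "Oak St"), ("Bob", "Elm St"), ("Ann", "Oak St")],
)

# Copy the distinct rows aside, drop the original, then move the copy back.
conn.execute("CREATE TABLE people_tmp AS SELECT DISTINCT * FROM people")
conn.execute("DROP TABLE people")
conn.execute("ALTER TABLE people_tmp RENAME TO people")
conn.commit()

remaining = conn.execute("SELECT name, address FROM people ORDER BY name").fetchall()
print(remaining)
```

Note that a table created with CREATE TABLE ... AS SELECT does not inherit indexes or constraints from the original, so these would need to be recreated afterwards.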

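2. The second kind of duplication usually calls for keeping the first record in each group of duplicates. It can be handled as follows.

Suppose the duplicated fields are Name and Address, and the goal is a result set that is unique on these two fields.

select identity(int,1,1) as autoID, * into #Tmp from tableName
select min(autoID) as autoID into #Tmp2 from #Tmp group by Name, Address
select * from #Tmp where autoID in (select autoID from #Tmp2)

The last select returns a result set with no repeated Name and Address values (it has an extra autoID column, which can be left out by listing the wanted columns in the select clause). Other databases can use a sequence instead, for example:

create sequence seq1;
select seq1.nextval as autoID, * into #Tmp from tableName

zuolo: Based on the example above, the statement I needed was SELECT MAX(id) AS ID, Prodou_id, FinalDye FROM blDBDdata GROUP BY Prodou_id, FinalDye ORDER BY id. I had been trying to use DISTINCT to get records that are unique on specific fields, which was a misconception.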
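When the table already has an auto-increment primary key, the same idea needs no identity() column or temp tables: pick MIN(id) per group and select the rows with those ids. A minimal sketch using the footprint table from earlier in the article, run on SQLite through Python's sqlite3 module (an assumption for the sake of a self-contained example; the statements above target SQL Server):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE footprint ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "username TEXT, city TEXT, visit_date TEXT)"
)
conn.executemany(
    "INSERT INTO footprint (username, city, visit_date) VALUES (?, ?, ?)",
    [
        ("mofei", "贵阳", "2019-12-05"),
        ("mofei", "贵阳", "2020-01-15"),
        ("mofei", "北京", "2018-10-10"),
        ("zhangsan", "上海", "2020-01-01"),
        ("zhangsan", "上海", "2020-02-02"),
        ("lisi", "拉萨", "2016-12-20"),
    ],
)

# Keep the full row with the smallest id in each (username, city) group.
first_visits = conn.execute(
    "SELECT id, username, city, visit_date FROM footprint "
    "WHERE id IN (SELECT MIN(id) FROM footprint GROUP BY username, city) "
    "ORDER BY id"
).fetchall()
ids = [row[0] for row in first_visits]
print(ids)
```

Unlike SELECT DISTINCT username, city, this returns complete rows, including the visit_date of the record that was kept.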

