2018百度之星程序设计大赛（资格赛）调查问卷 HDU6344 解题思路

暴力枚举, 剪枝 2019-01-18

2018百度之星程序设计大赛（资格赛）调查问卷 HDU6344 解题思路

2019-01-18

题目链接 HDU 6344 调查问卷

本人算法竞赛小菜鸡一只，这也是我第一篇博客，希望能和网上的各路大神分享自己的思路，在交流中不断进步！

话不多说，我来说说我对这道题目的思路：

1、题目分析

本题主要意思就是存在 n 份问卷，每份问卷存在 m 个问题，求出满足至少有 k 对问卷不同的问题子集个数。首先我们看输入的数量级，1≤m≤10 , 可以知道枚举问题子集的最多个数为 2^10=1024，状态空间数量很少，所以不难想出，大体思路如下：我们可以通过枚举子集然后判断其是否满足答案条件来进行统计即可得出答案。

2、细节思路

上面分析了主要思路就是通过枚举问题的所有子集，对每次情况判断之后进行统计。

枚举思路：我采用的是刘汝佳紫书里面的增量式递归枚举，枚举不要的问题集合

判断思路：我们对一个问题子集，统计出一共有多少种不同的问卷，以及有每种问卷都有多少份。此时计算公式如下：

不同的问卷对数$k=\sum (number[i]*number[j])$ 其中 i、j分别表示问卷的种数，number[i],number[j]分别表示i，j问卷的份数。例如：问卷集合为 AAA,BBB,AAA,ABC,BBB，则number[‘AAA’] = 2,number[BBB]=2,number[ABC]=1

k=number[‘AAA’]number[‘BBB’]+number[‘AAA’]number[‘ABC’]+number[‘BBB’]*number[‘ABC’]

那么现在的问题是对于一个问题集合：如何统计有多少种不同的问卷答案呢？我的思路如下：1、由于问题答案只有‘A’，‘B’两种答案，所以我们可以采用二进制编码，每份问卷的二进制编码即为该份问卷ID，例如 AAABBB，即编码为000111，编码过程中，如果该问题被删除，我们则不对该位进行编码，跳过即可。编码函数如下:

//获得索引为 index份 问卷的id
inline int get_id(int index)
{
	int id = 0;
	for (int i = 0; i < m; i++)
		if (!is_del[i])  id = id << 1 | (Q[index][i] - 'A');
	return id;
}

当两份问卷ID相同时，即代表这两份问卷是相同的。这样我们所有的问题ID为 0000000000———————1111111111 最多为2^10=1024个状态，用一个数组int my_set[1 << 12] 来保存，其中 my_set[i] 表示ID=i 的问卷个数。

我们现在已经统计完成每个ID的问卷份数,，只要对其运用上面说的公式求和，之后判断是否满足条件即可，代码如下

int ans = 0;
for (int i = 0; i<count - 1; i++)
	for (int j = i + 1; j < count; j++)
	{
		ans += my_set[exist[i]] * my_set[exist[j]];
		if (ans >= k)
			return true;
	}
return false;

3、算法设计

通过上面分析，大体思路我们已经了解，穷举+判断，继续深入想想，其实我们还可以对每种情况进行剪枝。剪枝情况如下：

1、如果当前枚举子集不符合条件，那么我们就不用继续枚举当前子集的子集了。例如当我们选择1,3,4,5,6问题时不满足情况，那么1,3,4,5,6情况的子集肯定也不满足情况，我们应当减去当前分支，直接返回。

2、对于n份问卷，最大的对数为 ,若 <k则不用判断直接输出0即可。

(我的想法其实还有一种最优性剪枝，但是加上之后不知道为什么答案是错的，求各路大神指点一下。)

思路如下：假设当前问题集合有 cur 个问题，那么当前问题集合能产生的不同问题种数最多为 2^cur 个，记2^cur = num_of_set。分为两种情况：

1、num_of_set > n 此时这种情况和上面剪枝的2情况一样，最优的是n份问卷每份问卷都不一样。

2、num_of_set < n 此时这种情况,按照我们上面的公式，应该是在每个 set 里面份数一样时 k 能取的最大值，例如假设 n可以整除num_of_set，如number_per_set = n/num_of_set = 2 ,则当每种问题中问卷份数都是一样时等于2时，k将取最大值，计算公式为:

k = 22 即总对数=选择两个不同集合第一个集合份数第二个集合份数 ,当集合份数相等时k取最大值。

若不能整除我就让 number_per_set = n / num_of_set + 1 计算 k ，若 k小于要求值，则剪枝.

具体剪枝代码如下:

bool need_cut(int cur)
{
	//最优性剪枝,当前最大能产生的 不同问卷的对数为
	int max_select = m - cur - 2;
	int max_set = pow(2, m - cur - 1);
	max_select = pow(4, max_select);
	//printf("max_select = %d\n", max_select);
	//
	if (n>max_set)
	{
		if ((max_select*(n / max_set+1)*(n / max_set+1)) < k)
			return true;
	}

	return false;
}

4、程序代码

其实这题不用剪枝也能AC，全部的代码如下：

#include <cstdio>
#include <cstring>
#include <set>
#include <algorithm>
#include <cmath>
using namespace std;
const int maxn = 1000 + 10;
const int maxquestion = 10 + 5;
int T;
int n, m, k;
char Q[maxn][maxquestion];
int P[maxquestion];
int is_del[maxquestion];
int select[maxquestion];
//保存每种情况有多少可能
int my_set[1 << 12];
int id[maxn];

int exist[maxn];


int Cn2;


//获得索引为 index份 问卷的id
inline int get_id(int index)
{
	int id = 0;
	for (int i = 0; i < m; i++)
		if (!is_del[i])  id = id << 1 | (Q[index][i] - 'A');
	return id;
}



//判断函数，cur表示当前有 cur-1 个 问题已经被丢弃，将要枚举丢弃 cur 个问题
bool need_cut(int cur)
{
	//最优性剪枝,当前最大能产生的 不同问卷的对数为
	int max_select = m - cur - 2;
	int max_set = pow(2, m - cur - 1);
	max_select = pow(4, max_select);
	//printf("max_select = %d\n", max_select);
	//
	if (n>max_set)
	{
		if ((max_select*(n / max_set+1)*(n / max_set+1)) < k)
			return true;
	}

	return false;
}

bool judge(int cur)
{
	memset(is_del, 0, sizeof(is_del));
	memset(my_set, 0, sizeof(my_set));
	int count = 0;
	//进行判断
	for (int i = 0; i < cur; i++)
		is_del[P[i]] = 1;
	//获得每个问题的id,并统计集合
	for (int i = 0; i < n; i++)
	{
		id[i] = get_id(i);
		//判断是否重复
		if (my_set[id[i]] == 0)
		{
			exist[count] = id[i];
			count++;
		}
		//统计次数+1
		my_set[id[i]] += 1;
	}
	int ans = 0;
	for (int i = 0; i<count - 1; i++)
		for (int j = i + 1; j < count; j++)
		{
			ans += my_set[exist[i]] * my_set[exist[j]];
			if (ans >= k)
				return true;
		}
	return false;
}

int problem_set = 0;


//枚举选的问题集合，当前枚举的是第 cur 个 元素
void generate_subset(int m,int cur,int mini_index)
{
	//如果需要剪枝则不继续枚举
	//if (need_cut(cur))   return;
	//printf("cur = %d  need_cut return false\n",cur);

	//否则进行判断
	if (!judge(cur)) return;
	problem_set++;

	//if (need_cut(cur))   return;

	//枚举问题集合
	for (int i = mini_index; i < m; i++)
	{
		P[cur] = i;
		//if (need_cut(cur))   return;
		//枚举下一个不选择的子集
		generate_subset(m, cur + 1, i + 1);

	}
	//if (need_cut(cur))   return;
}


void solve()
{
	if (Cn2 < k)
	{
		printf("0\n");
		return;
	}
	problem_set = 0;
	generate_subset(m, 0, 0);
	printf("%d\n", problem_set);
}

int main()
{
	
	freopen("C:\\Users\\lenovo\\Desktop\\test\\in.txt", "r", stdin);
	freopen("C:\\Users\\lenovo\\Desktop\\test\\out.txt", "w", stdout);
	scanf("%d", &T);
	int kcase = 1;
	while (T--)
	{
		scanf("%d%d%d", &n, &m, &k);
		for (int i = 0; i < n; i++)
			scanf("%s", Q[i]);
		Cn2 = n*(n - 1) / 2;
		printf("Case #%d: ", kcase++);
		solve();
	}
	
	return 0;
}

2018百度之星程序设计大赛（资格赛） 调查问卷 HDU6344 解题思路

2018百度之星程序设计大赛（资格赛） 调查问卷 HDU6344 解题思路

题目链接 HDU 6344 调查问卷

1、题目分析

2、细节思路

3、算法设计

4、程序代码

2018百度之星程序设计大赛（资格赛）调查问卷 HDU6344 解题思路

2018百度之星程序设计大赛（资格赛）调查问卷 HDU6344 解题思路